I’ve been really interested in improving my data skills, so when I came across the Data Analyst Nanodegree program from Udacity in December, I thought I’d give it a shot. I really had fun with it. Here’s my recap of the 6-month (Term 1 & Term 2) Data Analyst Nanodegree course:
Term 1 is all about learning Python and its powerful data libraries, pandas and Matplotlib. There’s also some SQL sprinkled in and a nice dose of statistics, but the focus is on the data analysis process with Python. This term was broken down into three parts: an introduction to Python, Data Analysis, and Statistics.
The first section, an introduction to Python, consisted of learning the basics of Python using Jupyter notebooks. All of the essentials were covered: strings, functions, modules, arithmetic. If you’re familiar with computer programming, you’ll breeze through this section. The project was a simple data analysis on bike sharing using Python. Most of the project implementation was scaffolded out, so it’s left up to the student to fill in the details. Overall, it was a nice introduction to start.
The second section focused on learning the basics of SQL and investigating datasets using Python. I had a pretty good grasp of SQL before this course, so I didn’t spend very much time on this portion. But I should mention they did a great job covering all of the necessary commands and clearly explained some of the trickier bits, like joins and window functions. This section also went in depth on how to analyze data using Python. Understanding how to clean data with pandas and plot it with Matplotlib is the cornerstone of this nanodegree program. Nearly everything in Term 2 is built on the foundational knowledge learned here, so don’t skip it if you’re going to be moving on. The project in this section was one of my favorites of the course: an investigation into datasets of your choosing. I chose to look at the effects of government health care spending on life expectancy and colon cancer.
The last section covered statistics. In it, we covered all the good stuff: regressions, confidence intervals, hypothesis testing, bootstrapping, and Bayes.
I found the content to be well taught and thorough, but my biggest complaint about this section was its format and pacing. The format of some of the videos and quizzes was noticeably different from everything else. Videos were very, very short, which made it harder for a topic to flow, and many of the quizzes covering the early setup of a problem felt unnecessary. Even with its flaws, this was unquestionably one of the more useful sections of the nanodegree.
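
As a rough illustration of bootstrapping, one of the statistics topics listed above, here is a minimal sketch in plain Python. It is not taken from the course materials: the function name `bootstrap_mean_ci` and the trip-duration numbers are made up for the example, and it uses the simple percentile method rather than anything more refined.

```python
import random
import statistics


def bootstrap_mean_ci(sample, n_resamples=2000, confidence=0.95, seed=0):
    """Percentile bootstrap confidence interval for the mean of `sample`."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(sample, k=n)) for _ in range(n_resamples)
    )
    # Trim both tails of the sorted means to get the percentile interval.
    tail = (1 - confidence) / 2
    lower = means[int(tail * n_resamples)]
    upper = means[int((1 - tail) * n_resamples) - 1]
    return lower, upper


# Hypothetical data: trip durations in minutes (made up for illustration).
trip_minutes = [12, 7, 25, 9, 14, 31, 8, 11, 19, 6, 22, 10]
low, high = bootstrap_mean_ci(trip_minutes)
print(f"sample mean: {statistics.fmean(trip_minutes):.1f}")
print(f"95% bootstrap CI for the mean: ({low:.1f}, {high:.1f})")
```

The idea is that resampling the observed data with replacement stands in for drawing fresh samples, so the spread of the resampled means gives a rough sense of how uncertain the sample mean is.
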
Term 2 consisted of Exploratory Data Analysis using R, Data Wrangling, and Data Story Telling using Tableau. Upon completion of these three sections, you’re awarded the coveted nanodegree.
Exploratory Data Analysis using R was the first section up in Term 2. It started off by going through the fundamentals of R, which aren’t that different from a programming language like Python, and learning about RStudio, the development environment. This was my first exposure to R, and I thought it was fairly easy to pick up. There are a lot of similarities in syntax between R and Python with pandas when analyzing and cleaning data. I found the main advantage of using R was the graphing tools. I was particularly impressed by the simplicity and power of ggplot. I found this section interesting, but not incredibly useful, as it was essentially a repeat of section 2 of Term 1 except that it used a different programming language (one that is declining in usage).
The second section of Term 2 focused on Data Wrangling/Cleaning. Nearly all of this was covered in section 2 of Term 1. The project required gathering, assessing, and cleaning a user’s Twitter dataset. Really, nothing new here.
The last section of Term 2 was about Data Story Telling using Tableau. I was really excited about this one but ended up pretty disappointed. I’d heard about Tableau many times when reading about or discussing data analysis but had never had the chance to check it out. Its drag-and-drop interface was extremely easy to use, and I really think this application could open up data analysis to more people. But that was also my biggest problem with the section: it taught an application’s user interface instead of continuing, or reinforcing, data analysis knowledge that goes beyond an easy-to-use interface. I personally would have loved to see this section replaced by a more difficult SQL section or more advanced plotting with Python.
- Udacity does provide nanodegree students with a “mentor.” I didn’t have any real need to use it, but I appreciated knowing that if I did get stuck there was someone I could talk to directly.
- Term 1 would be really difficult for anyone who hasn’t done any computer programming. I’d highly recommend a foundations course before starting the nanodegree if you haven’t programmed before.
- The timelines and due dates were generous. I rarely felt rushed and was able to accomplish nearly all of the course just on weekends.
- Jupyter notebooks are incredible. This was my first time using them and dang do they make learning programming SO nice.
My final advice:
It’s worth it. Term 1 has better material, Term 2 has the certificate. If you’re exclusively looking to improve your data analysis skills, then taking only Term 1 is sufficient. If you’re looking to improve your resume and go into the Data Analysis profession, then it’s hard to turn down Term 2 and the nanodegree certificate.