The Best Data Science Curriculum — Class Central Career Guides
I started creating my own data science master’s degree using online courses a year and a half ago. I have taken many data science-related courses, including a few machine learning courses, and audited portions of many more. I know the options out there, and what skills are needed for learners preparing for a data analyst or data scientist role.
For each of the guides in this series, I spent several hours trying to identify every online course for the subject in question, extracting key bits of information from their syllabi and reviews, and compiling their ratings. For these tasks, I used Class Central’s community and its database of thousands of course ratings and reviews.
Since 2011, Class Central founder Dhawal Shah has kept a closer eye on online courses than arguably anyone else in the world. Dhawal personally helped me assemble these guides.
About the Data Science Career Guide
Class Central’s Data Science Career Guide is a six-piece series that recommends the best MOOCs for launching yourself into the data science industry. The first five pieces recommend the best courses for several data science core competencies (programming, statistics, the data science process, data visualization, and machine learning). The final piece is a summary of those courses and the best MOOCs for other key topics such as data wrangling, databases, and even software engineering.
P.S. If you are looking for a complete list of Data Science MOOCs, you can find them on Class Central’s Data Science and Big Data subject page.
How We Picked Courses to Consider
Each course within each guide must fit certain criteria. There were subject-specific criteria, then two common ones that each guide shared:
It must be on-demand or offered every few months.
It must be an interactive online course, so no books or read-only tutorials. Though these are viable ways to learn, this guide focuses on courses. Courses that are strictly videos (i.e. with no quizzes, assignments, etc.) are also excluded.
We believe we covered every notable course that fit the criteria in each guide. Since there were seemingly hundreds of courses on Udemy, we chose to consider the most-reviewed and highest-rated ones only. There is always a chance that we missed something, though. Please let us know in the comments section if we left a good course out.
How We Tested
We compiled average ratings and number of reviews from Class Central and other review sites to calculate a weighted average rating for each course. We read text reviews and used this feedback to supplement the numerical ratings.
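The guide does not spell out the exact weighting scheme. As a rough sketch, assuming each review site's average rating is weighted by its number of reviews, the calculation might look like the following. The function name and the sample numbers are hypothetical and are not taken from the guide's data.

```python
def weighted_average_rating(site_ratings):
    """Combine per-site averages into one rating, weighted by review count.

    site_ratings: list of (average_rating, review_count) pairs, one per
    review site. Assumption: weights are the review counts.
    Returns (weighted_average, total_reviews), or (None, 0) if no reviews.
    """
    total_reviews = sum(count for _, count in site_ratings)
    if total_reviews == 0:
        # Matches the "no reviews on the sites used" case in the guide.
        return None, 0
    weighted_sum = sum(rating * count for rating, count in site_ratings)
    return weighted_sum / total_reviews, total_reviews


# Hypothetical course reviewed on two sites.
example = [(4.8, 30), (4.2, 10)]
rating, reviews = weighted_average_rating(example)
print(f"{rating:.2f}-star weighted average rating over {reviews} reviews")
```

Weighting by review count keeps a site with a handful of reviews from outweighing one with hundreds, which is one reason a rating over very few reviews is best read alongside the text reviews.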
We made subjective syllabus judgment calls based on a variety of factors specific to each subject. The criteria in our intro to programming guide, for example:
Coverage of the fundamentals of programming.
Coverage of more advanced, but useful, topics in programming.
How much of the syllabus is relevant to data science?
The University of Toronto’s Learn to Program series has an excellent mix of content difficulty and scope for the beginner data scientist. Taught in Python, the series has a 4.71-star weighted average rating over 284 reviews.
Rice University’s Interactive Programming in Python series contains two of the best online courses ever. They skew towards games and interactive applications, which are less applicable topics in data science. The series has a 4.93-star weighted average rating over 6,069 reviews.
The courses in UT Austin’s Foundations of Data Analysis series are two of the few with great reviews that also teach statistics and probability with a focus on coding up examples. The series has a 4.61-star weighted average rating over 28 reviews.
Duke’s Statistics with R Specialization, which is split into five courses, has a comprehensive syllabus with full sections dedicated to probability. It has a 3.6-star weighted average rating over 5 reviews, but the course it was based upon has a 4.77-star weighted average rating over 60 reviews.
MIT’s Intro to Probability course has by far the highest ratings of the courses considered in the statistics and probability guide. It covers probability exclusively and in great detail, plus it is longer (15 weeks) and more challenging than most MOOCs. It has a 4.82-star weighted average rating over 38 reviews.
Kirill Eremenko’s Data Science A-Z excels in breadth and depth of coverage of the data science process. The instructor’s natural teaching ability is frequently praised by reviewers. It has a 4.5-star weighted average rating over 5,078 reviews.
Big Data University’s Data Science Fundamentals covers the full data science process and introduces Python, R, and several other open-source tools. There are no reviews for this course on the review sites used for this analysis.
A five-course series, UC Davis’ Data Visualization with Tableau Specialization dives deep into visualization theory. Opportunities to practice Tableau are provided through walkthroughs and a final project. It has a 4-star weighted average rating over 2 reviews.
Endorsed by ggplot2 creator Hadley Wickham, DataCamp’s Data Visualization with ggplot2 series covers a substantial amount of theory. You will know R and its quirky syntax quite well after leaving these courses. There are no reviews for these courses on the review sites used for this analysis.
An effective practical introduction, Kirill Eremenko’s Tableau 10 Series focuses mostly on tool coverage (Tableau) rather than data visualization theory. Together, the two courses have a 4.6-star weighted average rating over 3,724 reviews.
Taught by the famous Andrew Ng, Google Brain founder and former chief scientist at Baidu, Stanford University’s Machine Learning covers all aspects of the machine learning workflow and several algorithms. Taught in MATLAB or Octave, it has a 4.7-star weighted average rating over 422 reviews.
A more advanced introduction than Stanford’s, Columbia University’s Machine Learning is a newer course with exceptional reviews and a revered instructor. The course’s assignments can be completed using Python, MATLAB, or Octave. It has a 4.8-star weighted average rating over 10 reviews.
Kirill Eremenko and Hadelin de Ponteves’ Machine Learning A-Z is an impressively detailed offering that provides instruction in both Python and R, which is rare and can’t be said for any of the other top courses. It has a 4.5-star weighted average rating over 8,119 reviews.
Parag Mital’s Creative Applications of Deep Learning with Tensorflow adds a unique twist to a technical subject. The “creative applications” are inspiring, the course is professionally produced, and the instructor knows his stuff. Taught in Python, it has a 4.75-star weighted average rating over 16 reviews.
Learn from a legend. Geoffrey Hinton, known as the “godfather of deep learning,” is internationally distinguished for his work on artificial neural nets. His Neural Networks for Machine Learning is an advanced class. Taught in Octave with exercises also in Python, it has a 4.11-star weighted average rating over 35 reviews.
Deep Learning A-Z is an accessible introduction to deep learning, with intuitive explanations from Kirill Eremenko and helpful code demos from Hadelin de Ponteves. Taught in Python, it has a 4.6-star weighted average rating over 1,237 reviews.
Our top course picks for each of the smaller subjects (in terms of course quantity) within data science are listed below:
DataCamp’s code-heavy instruction style and in-browser programming environment are great for learning syntax. Their Python courses have a 4.64-star weighted average rating over 14 reviews. Udacity’s Intro to Data Analysis, one of our recommendations for intro to data science courses, covers NumPy and pandas as well.
Again, DataCamp’s code-heavy instruction style and in-browser programming environment are great for learning syntax. Their R Programming Track, which is also one of our recommendations for programming courses in general, effectively combines programming fundamentals and R syntax instruction. The series has a 4.29-star weighted average rating over 14 reviews.
Stanford University’s Introduction to Databases covers database theory comprehensively while introducing several open source tools. Programming exercises are challenging. Jennifer Widom, now the Dean of Stanford’s School of Engineering, is clear and precise. It has a 4.61-star weighted average rating over 59 reviews.
DataCamp’s Importing & Cleaning Data Tracks (one in Python and one in R) excel at teaching the mechanics of preparing your data for analysis and/or visualization. There are no reviews for these courses on the review sites used for this analysis.
Udacity’s Data Analysis with R is an enjoyable introduction to exploratory data analysis. The expert interviews with Facebook’s data scientists are insightful and inspiring. The course has a 4.58-star weighted average rating over 19 reviews. It also serves as a light introduction to R.
Frank Kane’s Big Data series teaches all of the most popular big data technologies, including over 25 in the “Ultimate” course alone. Kane shares his knowledge from a decade of industry experience working with distributed systems at Amazon and IMDb. Together, the courses have a 4.52-star weighted average rating over 6,932 reviews.
Software skills are an oft-overlooked part of a data science education. Udacity’s testing, debugging, and version control courses introduce three core topics relevant to anyone who deals with code, especially those in team-based environments. Together, the courses have a 4.34-star weighted average rating over 68 reviews. Georgia Tech and Udacity have a new course that covers software testing and debugging together, though it is more advanced and not all relevant for data scientists.
Johns Hopkins University’s Building a Data Science Team provides a useful peek into data science in practice. It is an extremely short course that can be completed in a handful of hours and audited for free. Ignore its 3.41-star weighted average rating over 12 reviews, some of which were likely from paying customers.
Dr. Barbara Oakley’s Learning How to Learn and Mindshift aren’t data science courses per se. Learning How to Learn, the most popular online course ever, covers best practices shown by research to be most effective for mastering tough subjects, including memory techniques and dealing with procrastination. In Mindshift, she demonstrates how to get the most out of online learning and MOOCs, how to seek out and work with mentors, and the secrets to avoiding career ruts and general ruts in life. These are two courses that everyone should take. They have a 4.74-star and a 4.87-star weighted average rating over 959 and 407 reviews, respectively.
The Future of This Guide
The Data Science Career Guide will continue to be updated as new courses are released and ratings and reviews for them are generated.
Are you passionate about another discipline? Would you like to help educate the world? If you are interested in creating a Career Guide similar in structure to this one, drop us a note at [email protected].
As for my future, I have decided to take a position with Udacity as a Content Developer. That means I’ll be creating and teaching courses. That also means that this guide will be updated by somebody else.
I’m joining Udacity because I believe they are creating the best education product on the planet. Of all of the courses I have taken, online or at university, I learned best while enrolled in a Nanodegree. They are incorporating the latest in pedagogy and production, and still boast the best-in-class project review system, upbeat instructors, and healthy student and career support teams. Though a piecewise approach like the one we took in this guide can work, a cohesive program with projects and reviews throughout is much more student-friendly.
Updating the Data Analyst Nanodegree is my first task, which is a part of a larger effort to create a clear path of Nanodegrees for all things data. Students will soon be able to start from scratch with data basics at Udacity and progress all the way through machine learning, artificial intelligence, and even self-driving cars if they wish. Stay tuned.
About Class Central Career Guides
Class Central Career Guides are recommendations for the best online courses and MOOCs. They have one goal: to enable you to quickly figure out which courses can help you learn new skills and advance your career. Our editorial picks are thoroughly researched using reviews written by Class Central users, as well as data from other sources and our own subjective analysis.
These guides are updated frequently to always reflect the best in online education.
Drop us a note at [email protected] if you have any feedback or requests for particular career guides — it will help us prioritize. Also, reach out to us if you want to help us create more of these career guides. We are looking for contributors!
David Venturi created a personalized data science master’s curriculum for himself using MOOCs. He has a dual degree in Chemical Engineering and Economics, and especially enjoys math, stats, and coding. He’s a huge baseball and hockey fan, and writes about the latter with a focus on analytics.