The Introduction to Data Science class will survey the foundational topics in data science, namely:
Data Analysis with Statistics and Machine Learning
Data Communication with Information Visualization
Data at Scale -- Working with Big Data
The class will focus on breadth and present the topics briefly instead of focusing on a single topic in depth. This will give you the opportunity to sample and apply the basic techniques of data science.
This course is also a part of our Data Analyst Nanodegree.
Why Take This Course?
You will have an opportunity to work through a data science project end to end, from analyzing a dataset to visualizing and communicating your data analysis.
Through working on the class project, you will be exposed to and understand the skills that are needed to become a data scientist yourself.
Lesson 1: Introduction to Data Science
Introduction to Data Science
What is a Data Scientist
Pi-Chaun (Data Scientist @ Google): What is Data Science?
Gabor (Data Scientist @ Twitter): What is Data Science?
Problems Solved by Data Science
Create a New Dataframe
Lesson 2: Data Wrangling
What is Data Wrangling?
Common Data Formats
What are Relational Databases?
Aadhaar Data and Relational Databases
Introduction to Database Schemas
Data in JSON Format
How to Access an API Efficiently
Impute using Linear Regression
Tip of the Imputation Iceberg
Lesson 3: Data Analysis
Kurt (Data Scientist @ Twitter) - Why is Stats Useful?
Introduction to Normal Distribution
Welch T Test
Stats vs. Machine Learning
Different Types of Machine Learning
Prediction with Regression
How to Minimize Cost Function
Coefficients of Determination
Lesson 4: Data Visualization
Effective Information Visualization
Napoleon's March on Russia
Don (Principal Data Scientist @ AT&T): Communicating Findings
Rishiraj (Principal Data Scientist @ AT&T): Communicating Findings Well
Intro to Data Science is an intermediate-level course that assumes basic Python programming skills and knowledge of statistics. The course focuses on gathering, manipulating, analyzing and visualizing data using Python and various Python packages such as numpy, scipy and pandas. One of the best parts of this course is the exposure to several Python packages in the scipy stack, although I wish more time had been devoted to explaining what the various modules in the scipy stack do, how to set them up at home and when to use them.
The first lesson was a fairly gentle introduction with an interesting homework project dealing with data from the Titanic disaster. Lesson 2 goes into more detail about gathering and cleaning data using Pandas and an additional module that lets you make SQL queries to extract data from Pandas data frames. Lesson 3 jumps into data analysis with a T test and linear regression using gradient descent. Going from basic data manipulation into these topics was a bit jarring in terms of difficulty, and more time could have been spent explaining how the functions worked. I left without a great appreciation of what gradient descent is really doing. Lesson 4 is focused on making visualizations using a module that attempts to port the functionality of the R language's ggplot2 plotting package. Finally, lesson 5 introduces the concept of big data and MapReduce as a way to deal with large data sets. Each homework assignment after the first has students working with New York subway turnstile data, which lets students become familiar with the data over the course. This was a very good decision, since it lets students focus on learning new concepts rather than spending time familiarizing themselves with new data sets over and over again.
Joe Foley is taking this course right now, spending 8 hours a week on it and found the course difficulty to be medium.
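For readers who, like the reviewer above, want a clearer picture of what gradient descent does in a linear regression, here is a minimal sketch. It is not the course's own code: the function names (`compute_cost`, `gradient_descent`), the toy data, and the learning rate are all assumptions chosen for illustration. Each iteration computes the predictions, measures how the squared-error cost changes with each coefficient, and nudges the coefficients a small step in the direction that lowers the cost.

```python
import numpy as np


def compute_cost(features, values, theta):
    """Mean squared error cost (halved), as commonly used for linear regression."""
    m = len(values)
    residuals = features @ theta - values
    return float(residuals @ residuals) / (2 * m)


def gradient_descent(features, values, theta, alpha, num_iterations):
    """Repeatedly step theta against the gradient of the cost.

    alpha is the step size (learning rate); both it and num_iterations
    are illustrative values, not recommendations.
    """
    m = len(values)
    cost_history = []
    for _ in range(num_iterations):
        predictions = features @ theta
        # Gradient of the cost with respect to each coefficient.
        gradient = features.T @ (predictions - values) / m
        theta = theta - alpha * gradient
        cost_history.append(compute_cost(features, values, theta))
    return theta, cost_history


# Toy data (hypothetical): points that lie exactly on y = 2 + 3x.
x = np.linspace(0.0, 1.0, 50)
features = np.column_stack([np.ones_like(x), x])  # column of ones for the intercept
values = 2.0 + 3.0 * x

theta, cost_history = gradient_descent(
    features, values, theta=np.zeros(2), alpha=0.5, num_iterations=5000
)
print(theta)
```

On this toy data the fitted coefficients should approach the intercept 2 and slope 3 used to generate the points, and `cost_history` should shrink from one iteration to the next. If the step size `alpha` is set too large, the cost can grow instead, which is one way to build intuition for what the algorithm is doing.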
I was skeptical when I enrolled in Udacity's Data Analyst Nanodegree program, but not only have they provided the experience they said they would, they have steadily made improvements since I enrolled. How many times in your life have you had that experience? Here are SOME of the improvements they have made while I have been enrolled. Initially one could get one-on-one help; it was usually 1 to 2 days out, but it was by video chat.
This was great. I had tried a competitor's course, and sometimes one just cannot figure out why something is not working. But not with Udacity. Then they scrapped that and instituted a MENTOR program. Here one could instant message someone who would get back to you in a few hours. Then they scrapped that and now offer LIVE HELP. It is a chat box into which you type the gist of your question. In less than 10 minutes, often in 3, someone comes on. Usually they can immediately figure out your mistake (it seems students make a finite number of errors), but if they can't, they ask you to copy and paste your code. And if they still cannot figure it out, i.e., if you have really made a mess of things, they do a screen sharing session to get you back on the rails. Don't make a mistake. Just sign up for Udacity.
Shahrukh Ahmed completed this course, spending 5 hours a week on it and found the course difficulty to be easy.
Though the course uses interesting examples for teaching data science concepts, its overreliance on the online grader for practice often makes the learning feel redundant. A big part of learning programming is experimentation, which the grader does not allow for.