Introduction to Data Science in Python

University of Michigan via Coursera

Details

Provider: Coursera
Pricing: Free Online Course (Audit)
Languages: English
Certificate: Paid Certificate Available
Duration & workload: 1 day 10 hours 52 minutes
Sessions: On-Demand
Level: Intermediate
Subtitles: Arabic, French, Portuguese, Italian, German, Russian, English, Spanish, Korean, Thai, Indonesian, Hindi, Pashto, Bengali, Chinese, Hungarian, Ukrainian, Urdu, Kazakh, Swedish, Greek, Japanese, Azerbaijani, Polish, Farsi, Dutch, Turkish

Part of

Applied Data Science with Python

Overview

This course introduces the learner to the basics of the Python programming environment, including fundamental Python programming techniques such as lambdas, reading and manipulating CSV files, and the NumPy library. It introduces data manipulation and cleaning techniques using the popular pandas data science library, presents the Series and DataFrame abstractions as the central data structures for data analysis, and includes tutorials on how to use functions such as groupby, merge, and pivot tables effectively. By the end of this course, students will be able to take tabular data, clean it, manipulate it, and run basic inferential statistical analyses. This course should be taken before any of the other Applied Data Science with Python courses: Applied Plotting, Charting & Data Representation in Python; Applied Machine Learning in Python; Applied Text Mining in Python; and Applied Social Network Analysis in Python.
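As a rough illustration of the Python basics named above (reading CSV data and using lambdas), here is a minimal sketch using only the standard library. The data, column names, and variable names are hypothetical and are not taken from the course materials.

```python
import csv
import io

# Hypothetical CSV content, standing in for a file on disk.
raw = io.StringIO(
    "name,city,mpg\n"
    "civic,Ann Arbor,32\n"
    "f150,Detroit,18\n"
    "prius,Ann Arbor,50\n"
)

# DictReader yields one dict per row, keyed by the header line.
rows = list(csv.DictReader(raw))

# A lambda is an anonymous single-expression function; here it is the sort key.
by_mpg = sorted(rows, key=lambda row: float(row["mpg"]), reverse=True)

# A list comprehension filters and transforms in one expression.
ann_arbor = [row["name"] for row in rows if row["city"] == "Ann Arbor"]

print([row["name"] for row in by_mpg])
print(ann_arbor)
```

Values read by the csv module arrive as strings, which is why the sort key converts `mpg` with `float` before comparing.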

Syllabus

Fundamentals of Data Manipulation with Python

In this week you'll get an introduction to the field of data science, review common Python functionality and features which data scientists use, and be introduced to the Coursera Jupyter Notebook for the lectures. All of the course information on grading, prerequisites, and expectations is in the course syllabus, and you can find more information about the Jupyter Notebooks on our Course Resources page.

Basic Data Processing with Pandas

In this week of the course you'll learn the fundamentals of one of the most important toolkits Python has for data cleaning and processing -- pandas. You'll learn how to read data into DataFrame structures, how to query these structures, and the details of how such structures are indexed.
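The three ideas in this week's description (reading data into a DataFrame, querying it, and indexing it) might look roughly like the sketch below. It assumes pandas is installed; the medal-count data and all variable names are made up for illustration and are not the course's own dataset.

```python
import io

import pandas as pd

# Hypothetical data; in practice this would be a path to a CSV file.
csv_text = io.StringIO(
    "country,gold,silver\n"
    "Norway,5,3\n"
    "Kenya,0,2\n"
    "Japan,3,4\n"
)

# Read the data and use the 'country' column as the row index.
df = pd.read_csv(csv_text, index_col="country")

# Querying with a boolean mask: keep rows where the condition is True.
winners = df[df["gold"] > 0]

# Label-based lookup (.loc) versus position-based lookup (.iloc).
japan_silver = df.loc["Japan", "silver"]
first_row_gold = df.iloc[0, 0]

print(winners.index.tolist())
print(japan_silver, first_row_gold)
```

The split between `.loc` (labels) and `.iloc` (integer positions) is the part of pandas indexing that tends to trip up newcomers, so it is worth trying both on a small frame like this one.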

More Data Processing with Pandas

In this week you'll deepen your understanding of the Python pandas library by learning how to merge DataFrames, generate summary tables, group data into logical pieces, and manipulate dates. We'll also refresh your understanding of scales of data, and discuss issues with creating metrics for analysis. The week ends with a more significant programming assignment.
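A minimal sketch of the operations listed for this week (merging, grouping, summary tables, and date handling) is shown below. It assumes pandas is installed; the sales and store tables are hypothetical and only serve to make the calls concrete.

```python
import pandas as pd

# Two hypothetical tables that share a 'store' key.
sales = pd.DataFrame({
    "store": ["A", "A", "B", "B"],
    "date": ["2020-01-15", "2020-02-10", "2020-01-20", "2020-02-25"],
    "amount": [100, 150, 80, 120],
})
stores = pd.DataFrame({"store": ["A", "B"], "region": ["North", "South"]})

# Manipulating dates: parse strings into datetimes, then extract the month.
sales["date"] = pd.to_datetime(sales["date"])
sales["month"] = sales["date"].dt.month

# Merging: a left join keeps every sales row and attaches its region.
merged = sales.merge(stores, on="store", how="left")

# Grouping: total amount per region.
totals = merged.groupby("region")["amount"].sum()

# Summary table: regions as rows, months as columns, summed amounts as cells.
table = merged.pivot_table(
    index="region", columns="month", values="amount", aggfunc="sum"
)

print(totals)
print(table)
```

Choosing `how="left"` here is a design decision for the example: it guarantees that no sales row is dropped even if a store were missing from the lookup table.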

Answering Questions with Messy Data

In this week of the course you'll be introduced to a variety of statistical techniques such as distributions, sampling, and t-tests. The week ends with two discussions of science and the rise of the fourth paradigm -- data-driven discovery.
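To make the terms distribution, sampling, and t-test concrete, here is a small standard-library sketch. It computes Welch's t-statistic by hand, which is an illustrative choice and not necessarily the method used in the course; in practice a library routine such as `scipy.stats.ttest_ind` (with `equal_var=False`) would also return a p-value. The function name `welch_t` and the simulated data are hypothetical.

```python
import math
import random
import statistics


def welch_t(a, b):
    """Welch's t-statistic for two independent samples (variances may differ)."""
    var_a = statistics.variance(a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(b)
    std_err = math.sqrt(var_a / len(a) + var_b / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / std_err


# Distribution: draw a simulated population from a normal distribution.
random.seed(0)
population = [random.gauss(50, 10) for _ in range(10_000)]

# Sampling: take two random samples and shift the second by 5 units.
sample_a = random.sample(population, 30)
sample_b = [x + 5 for x in random.sample(population, 30)]

# t-test: a large absolute value suggests the sample means really differ.
t_stat = welch_t(sample_a, sample_b)
print(round(t_stat, 3))
```

A quick sanity check on the formula: for `[1, 2, 3, 4, 5]` and `[2, 3, 4, 5, 6]`, both variances are 2.5, the standard error is 1, and the statistic is -1.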

Taught by

Christopher Brooks, Kevyn Collins-Thompson, Daniel Romero and V. G. Vinod Vydiswaran

Reviews

2.4 rating, based on 46 Class Central reviews

4.5 rating at Coursera based on 26915 ratings

Anonymous

Lectures are too fast. They don't explain anything, just running through examples. For example they use a function but they don't explain what arguments it takes so you have to read about it elsewhere.
Paul Leitner

a little background on me - I have taken 10+ online courses, good, bad and everything in between. I work in business intelligence and have a very solid background in various dialects of SQL, work with quite a bit of python.

Frankly, I find this course to be TERRIBLE. Here are the main reasons why:

-) the instructor does not teach - he reads over a script that mentions every functionality ONCE, at a very high speed. forget about practice. students are left to figure out how to work the assignments, while doing the assignments - this is exactly NOT how to acquire a solid grasp of the material. I can not count the amount of times "stackoverflow" is mentioned in the videos.

-) most assignments are autograded. which is a good idea in principle. in this case, it is nothing short of excruciatingly annoying. the comment threads in the forums reporting problems (ambiguous error messages, 0% for no reason, OUTDATED SOFTWARE LIBRARIES IN THE AUTOGRADER you name it) range in the hundreds.

Finally an example: In week 3 the assignment goes well beyond what was even mentioned in the videos, opening with the following paragraph:

"This assignment requires more individual learning then the last one did - you are encouraged to check out the pandas documentation to find functions or methods you might not have used yet, or ask questions on Stack Overflow and tag them as pandas and python related. And of course, the discussion forums are open for interaction with your peers and the course staff."

Ok fair enough - I fired up my IDE, spent 3+ hours on Problem 1 (20% of the grade, the assignment is supposed to be doable in 2 hours. let me assure you, if you have not done this before and need to look up the functions you need that were NOT mentioned in the videos, this is absolutely impossible. 8 hours is realistic, 6 if you're good) - I got all the data cleaned, copied my code into the online notebook and voila - it crashed. After (ANOTHER) half hour of googling I noted that the pandas library that's used to grade the course is slightly outdated... it's 2 years old. 2 YEARS!

not some additional library. PANDAS - the MAIN library of the course is so outdated as to require SIGNIFICANTLY different code from what you would use nowadays, in practice. This sums up the course pretty much perfectly.

In my opinion, take an equivalent course somewhere else or buy the book and figure it out yourself, which is what you are left to do in any case if you take this course.

Sorry coursera, this one is just terrible.
Anonymous

The course in and of itself is not _terrible_, but expect to do a lot of searching for outside help on Stack Overflow and the like as the lectures do not provide anywhere near sufficient material to solve the problems. This is pretty much to be expected these days, but the lectures aren't really sufficient to solve the material. I personally found it more worthwhile to just skip the lectures since they were fairly lengthy and didn't provide all of the necessary information anyway. Also, the "expected time" on the assignments could easily be tripled or quadrupled (if not even moreso) -- a fact corroborated with a lot of other senior developers in there (trust me, all of this information is VERY common among all participants -- not just my "sour grapes".) The first programming assignment states it is "90 minutes" which is a total joke, definitely plan on 8 hours if you're new to pandas. If you haven't even used Python before, you might be in even more of a world of hurt.

The real reason this class is no good is that the autograder has constant and seemingly incessant/intractable problems. It's enough of a challenge just to get your interactive Python notebook to display the right values, but VERY frustrating when the autograder then does not recognize the values. This is a chronic problem experienced by a huge number of students as the message forums indicate and the only "help" comes in the form of "well, the interactive Python notebook isn't the same as the compiled code" with subsequent "solutions" and "workarounds" for the problem given by the staff that are either not straightforward or just simply don't work. Needless to say, for $49 you get what you pay for with a lot of these classes but expect some serious frustration dealing with these issues. I'm giving this course an extra star since the assignments do help you learn Pandas pretty quickly by "throwing you in the deep end" but my guess is there's much much better data science courses out there.
Rtodyssey

Background:
I have some basic programming understanding of loops, functions and data structures in a couple of languages. I wanted a course to give me strong fundamentals of Python for usage in Data Science.

Course:
The videos give an overview of pandas, python and numpy. Some of the functionalities are explained, accompanied by a notebook of sample code to help. The assignments are a different ballgame. The week 2 assignment is fairly closely based on what is taught in the course for that week, though a little research on Stack Overflow and in the pandas documentation was needed. Over the next two weeks, the divergence increases. The amount of data cleaning needed increases with each week; in the last week's assignment we are expected to make a dataframe out of a simple copy-paste of text from a Wikipedia page.

Verdict:
I found the course very helpful for the reason that it forced me out of my comfort zone. If the assignments were mainly from the week's material, I would have used them from memory and forgotten later. They have forced me to go research online, read documentation, look at forums and forced me to do many iterations of figuring out how to solve a piece of code in pandas - which in my opinion is an extremely valuable skill considering the vast ocean of the subject. Also, my experience with industry data has been that data cleaning is one of the most crucial parts of any analysis and it is cumbersome, which is again something the course focused on.

While other reviews have downrated this course for being difficult and the assignments diverging from the lectures, I am giving this a 5 precisely for that reason.
Anonymous

The worst course I've ever taken. Some of the stuff in there is useful, but this isn't really a "course." It's more like a book on tape. The professor is literally reading a transcript and it sounds like he's reading a kids' book talking about data science. He constantly does these unnecessary hand gestures and goes slow through the stuff that is easy, but fast through the stuff that needs to be explained. He doesn't really explain the reasons behind anything... Like I said, it sounds like he's reading a book. I was very annoyed watching his videos.
Anonymous

The lecturer puts minimal effort into the videos; information is scarce and difficult to understand.
The assignments have a really steep learning curve, and are too difficult to complete, given the topics covered by the lecturer.
Help from the teaching staff is kept to a minimum, and most students don't actually manage to complete the assignments.
In conclusion, the worst course I've ever taken in my academic life.
D C

This course is fast, but it's not the good kind of challenging. The instructor sounds like he's reading from a script, and there's almost no explanation of anything, even basic pandas syntax. "Here's a function you can use," and then just types it out without any explanation of, e.g., what parameters are mandatory, what options there are, and what they mean.

The result is that each 7-min video takes me hours to work through and think about, and I'm still left with many questions. And no, I'm not a beginner to python. I'm honestly not sure if I'll finish the course at this point, though I'm halfway through.
Anonymous

It is a challenging course for sure, but I missed more effort when it comes to explaining "Lambda" or "List Comprehension". Actually, I had to google a lot of times just to understand the basic concepts of those functions - I'm not a Python noob though.

The "tasks" during the videos are a bit frustrating, it feels like "here's a formal definition of what Lambda is, now manage to solve something you probably won't understand because I didn't tell you how it works".
Graham C

Really excellent course. Fast paced so be prepared to 'pause' to research or think about things. Doesn't spoon feed you so a bit of googling required now and again. Challenging assignments really make you think. Auto-grader for assignments has been buggy but is being fixed. Suggest you know Python a bit before starting.
The course assignment can be graded without paying for the course - very generous functionality compared to most other courses where this is locked down.
Great first session, can't wait for the next!
Julián Urrea

The course is definitely NOT for beginners in python. It's more than just challenging, sometimes, you don't know how to continue!!!, so you feel you want to quit at some point. What I loved the most, was the collaboration between students in the forum. A lot of students with great experience always ready to help. Sadly, I never saw a mentor's reply. But, I think, once you complete, you can say that you learned very interesting things to do with pandas...
Juan Velasquez

Find another course. I got the impression that the professor was just rapidly reading from a script and wasn't really interested in the student's progress. He seemed, as another poster noted, "disconnected" and looked on teaching the course as a necessary evil. Most of the assignments were disconnected from the material being taught.
Anonymous

They shouldn't advertise that you can learn python in this class. The first part of the specialization is terrible at teaching the language, and a beginner will get lost and discouraged right away. So many crucial building blocks are skipped over along the way, that I don't even see the point of them starting with a couple of basic subjects. You have to know python to take on this specialization and get the most out of it. Having the professor expect you to learn everything from Google is not the way to go, and is a terrible waste of people's time and money.
Anonymous

Disconnect with the word "Introduction"... lecture goes from basic to quiz that assumes advanced knowledge. Think: Chem 101 to build a rocket engine the next day.

Stick with Dr Chuck's python course if you want to learn at the Intro level.
Jeff Trawick

The presentation of this class is poor. Most of the time the lecturer is describing important code concepts (down to square brackets, commas, etc.) using only speech, with no visual cues (i.e., written code to look at). Inconceivable! If that's not bad enough, the background shows people supposedly working at their desks; thus the lecture "view" is dominated by artifacts that are not pertinent to the material. At intervals, the view switches to a Jupyter Notebook, and the lecturer walks through the material far too fast to allow anything to "sink in." (Luckily I've used Pandas in the past and am able to find other materials once I figure out the point of the lecture.)

This is the first Coursera course I've paid for. I'm very disappointed, having been accustomed to excellent instruction in previous Coursera MOOCs. I hate writing this review, because I know that a lot of work went into the class, and I'm very grateful to Coursera for the tremendous enrichment I've received in the past. But there must be high standards of instruction for a resource like Coursera to remain so valuable.

I find the outline of the series of courses very compelling, as it should take me to the next level on several topics I've worked with in the past. For now I will continue, with the expectation that I need to use the videos and homework assignments to discover the detailed objectives for the week, and I will use Pluralsight, Lynda, "Python for Data Analysis 2e," etc. to actually learn them :(
Anonymous

This course is honestly not good for Python beginners despite the name. Greatly ramps itself up in difficulty when week 2 comes around, probably due to the one week free trial period.

Lots of functions and methods lack explanation and the response is to do research in Stackoverflow.

I'm hating life right now
Anonymous

Too fast and just talking through the typing of syntax is just not the way I learn. Nothing like the courses Charles Severance teaches. This is NOT teaching but rather talking quickly through syntax. NOT HELPFUL!
Anonymous

This is a pretty awful course, as of the time of writing this review in July of 2019. Let me preface this by saying that the material you learn is very helpful. Pandas is a great library to learn for loading, cleaning, and manipulating large amounts of data. But the real problem with this course isn't the material, it's the lectures and the autograder. The lectures are very short. They don't cover the concepts well enough, and some material is blatantly skipped and you have to learn it yourself through google. Then comes the video quizzes which test you on functions and concepts that haven't even been introduced yet. It's like the quizzes were put at timestamps randomly. Then comes the worst part of the course: the Autograder. I'm not sure how old this course is but the autograder is running on an outdated version of both python and pandas. What does this mean for you? Well if you want to code on your computer instead of the course's broken online coding notebook, you will run into severe code-breaking bugs between versions. It really ruins the course. I learn the material but then spend hours trying to please the broken autograder. Most of the time in this course isn't spent learning, it's spent fixing code that the autograder rejects even though it runs perfectly on your machine locally. Have fun!
Mark Adelhelm

I would agree with many of the criticisms offered here. While the Coursera team has done a good job of packaging this to make it easy to navigate, the organization of the content and the lecture coverage is insufficient to be prepared for the exercises assigned. I was faithfully plowing through the first half of the course and got to the exercises at the end of week 2 and was like "how did I miss the instruction to solve this problem?" Then I started reading all of the "help, I'm lost" posts to the exercise and realized I was not alone. The sad thing is that I convinced myself that I would not give up and could figure this out so I kept paying the monthly $50 to extend the course. A complete waste of money I now realize. One of the most valuable pieces of instruction I got from the course was to buy Wes McKinney's "Python for Data Analysis" text and Matt Harrison's "Learning the Pandas Library". These two volumes are MUCH better organized and in depth than the course itself. Invest in these and do the exercises provided in them and you won't need this course.
Anonymous

Like many of my fellow reviewers, I was not satisfied with the quality and level of instruction for this course. The content was really light and fast, with few examples. The course production itself was kind of choppy, with the lecturer being interrupted mid-sentence with "pop-quizzes" on topics he was just delivering. It was more like he was talking to a slightly less knowledgeable Python programmer delivering "reminders", than teaching paying (or non-paying) students in the subject.

The difficulty of assignments was way beyond the level taught or discussed. BUT, the course has apparently been around long enough that exact questions and answers can be found through simple Google searches. I am not a lazy or uninformed student, but I dropped the course when I realized the only thing I was learning was how to cut, paste and obscure others' work, not produce correct answers through the application of what had been taught.
Paulo Eduardo Neves

I really appreciated this course. The assignments are excellent, but they took me more time than announced.

The ability to submit your assignments and have them automatically corrected, even if you are not paying for the certificate, is great.

I just think that maybe it is a "too hard" introduction. You must already know python, and, I'd say, should have already studied a little of pandas. The explanation of pandas is really quick, but full of valuable real world tips.

For the assignments you'll need a lot of pandas knowledge that isn't in the videos, so prepare for a lot of searching in StackOverflow and in the docs. I believe it is purposeful, so the assignments mimic a real world problem.