This is an introductory-level course in supervised learning, with a focus on regression and classification methods. The syllabus includes: linear and polynomial regression, logistic regression and linear discriminant analysis; cross-validation and the bootstrap, model selection and regularization methods (ridge and lasso); nonlinear models, splines and generalized additive models; tree-based methods, random forests and boosting; support-vector machines. Some unsupervised learning methods are discussed: principal components and clustering (k-means and hierarchical).
This is not a math-heavy class, so we try to describe the methods without heavy reliance on formulas and complex mathematics. We focus on what we consider to be the important elements of modern data analysis. Computing is done in R. There are lectures devoted to R, giving tutorials from the ground up, and progressing with more detailed sessions that implement the techniques in each chapter.
Hchan completed this course, spending 3 hours a week on it and found the course difficulty to be easy.
First, a disclaimer: the online exercises of this course are extremely thin, so your score in this class is neither necessary nor sufficient to gain mastery of the material. It helps if you think of this course as supplementary material for the book (An Introduction to Statistical Learning by James, Witten, Hastie, Tibshirani). In this light, the course becomes an exceptional gem, because the book is really incredibly good. My recommendation is to take the time to read the book cover to cover, trying many of the excellent exercises in it. Then, as a recap or a refresher, go through this online course. The lectures highlight the most important parts of each chapter and are beautifully paced and presented. You will find that they are a perfect complement to the book and many concepts will become clearer and more concretely established in your mind. However, if you try to take this as a stand-alone course, you will be disappointed and likely not learn or retain very much.
Prose is taking this course right now, spending 4 hours a week on it and found the course difficulty to be medium.
Good book, terrible MOOC.
First of all: huge kudos to Hastie and Tibshirani for their contributions to the field, and making their seminal books freely available. None of this is directed personally at them - it's difficult to design a good MOOC. Problems with this one:
- it's basically just a series of lectures presenting the ideas from the book
- without prior familiarity, or lengthy reading, these probably aren't adequate for grokking the concepts
- the 'assessment' is less than cursory (a few sudden-death MCQs per week)
- the questions and content for the week are sometimes... loose.
- there's no real opportunity to practice concepts in a concrete way with feedback.
I'll certainly watch the lectures. But these have made their way to YouTube, and the 'MOOC parts' weren't worth the signup on Stanford OpenEdu. :(
Took this course (or at least parts of it) on the Stanford Online platform. Unfortunately, this course is geared towards people who already have some classroom knowledge of the mathematics, statistics and programming concepts (typically, a bachelor's degree or significant experience in these areas). The course lectures are a bit drab, honestly. The book is fantastic. I wish the exercises were better thought out. Some of the questions are just arbitrarily hard (with no background content in the lecture or book), and some are just too simple.
Overall, this is a course I'd recommend for anyone who has the time to go back and forth between topics a lot, or for anyone who already has a background in some of the subjects at least at a bachelor's level.
I was really excited about this course and wanted to like it but had to be honest in this review. But do read the book - the writers have done a great job of it.
Procellaria completed this course, spending 6 hours a week on it and found the course difficulty to be medium.
The course is a good overview of supervised learning methods. Most of the lessons are clear and self-consistent, although in some cases pre-existing knowledge of statistical concepts is necessary for a full understanding. The teachers pay special attention to introducing the proper use of the techniques. The R sessions are useful and clear. Nevertheless, the course can be improved in several points (in my opinion Ch9 and Ch10 are hasty, and while the tree-based methods are introduced properly, the explanations of random forests and boosting are not completely clear).
Poorly done video lectures in which the instructors simply read from the slides. Also, the quiz questions weren't very helpful for testing your knowledge of the material or helping with retention. I got the impression the quiz questions were thrown together at the last minute simply because the edX platform required quiz questions be inserted at some point.
You-cyuan completed this course, spending 6 hours a week on it and found the course difficulty to be medium.
Pros: This course will give you a quick introduction to common machine learning algorithms and basic principles of data science implementation. The course material and videos are very concise and fun to watch. Recommended for beginners with a basic statistical analysis background.
Cons: Only a limited number of programming assignments are provided. I highly recommend following the homework and examples in the book.
This was a wonderful course. The professors gave the impression that the material was interesting, learnable and even fun. One has the feeling of actually being in the classroom. I could not help thinking about "Car Talk" on public radio. A great introduction to valuable, timely information. Yes, you have to work at it, but you will be rewarded.
Cassie completed this course, spending 6 hours a week on it and found the course difficulty to be medium.
It is not an easy class, but it is worth spending your time on. After taking the class, I have greatly improved in R programming and machine learning. And it has a free textbook, which is also a great book.