Statistical experiment design and analytics are at the heart of data science. In this course you will design statistical experiments and analyze the results using modern methods. You will also explore the common pitfalls in interpreting statistical arguments, especially those associated with big data. Collectively, this course will help you internalize a core set of practical and effective machine learning methods and concepts, and apply them to solve some real world problems.
Learning Goals: After completing this course, you will be able to:
1. Design effective experiments and analyze the results
2. Use resampling methods to make clear and bulletproof statistical arguments without invoking esoteric notation
3. Explain and apply a core set of classification methods of increasing complexity (rules, trees, random forests), and associated optimization methods (gradient descent and variants)
4. Explain and apply a set of unsupervised learning concepts and methods
5. Describe the common idioms of large-scale graph analytics, including structural query, traversals and recursive queries, PageRank, and community detection
Practical Statistical Inference
Learn the basics of statistical inference, comparing classical methods with resampling methods that allow you to use a simple program to make a rigorous statistical argument. Motivate your study with current topics at the foundations of science: publication bias and reproducibility.
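As a rough illustration of what "a simple program" for a resampling argument might look like, here is a minimal sketch of a two-sided permutation test for a difference in means. It is not taken from the course materials; the function name `permutation_test`, its parameters, and the sample data are assumptions made up for this example.

```python
import random

def permutation_test(group_a, group_b, n_resamples=10_000, seed=0):
    """Estimate a two-sided p-value for the difference in group means.

    Hypothetical helper for illustration: repeatedly shuffles the pooled
    data and counts how often a random split is at least as extreme as
    the observed one.
    """
    rng = random.Random(seed)
    n_a = len(group_a)
    pooled = list(group_a) + list(group_b)
    n_b = len(pooled) - n_a

    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)

    extreme = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        diff = abs(sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b)
        # Small tolerance so floating-point noise does not hide ties.
        if diff >= observed - 1e-12:
            extreme += 1

    # Add-one correction keeps the estimate strictly above zero.
    return (extreme + 1) / (n_resamples + 1)

# Made-up data: two groups that are clearly separated.
control = [1, 2, 3, 4, 5, 6, 7, 8]
treated = [101, 102, 103, 104, 105, 106, 107, 108]
p_value = permutation_test(control, treated)
```

The argument rests only on shuffling labels and counting, which is why resampling is often presented as an alternative to deriving a sampling distribution by hand.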
Follow a tour through the important methods, algorithms, and techniques in machine learning. You will learn how these methods build upon each other and can be combined into practical algorithms that perform well on a variety of tasks. Learn how to evaluate machine learning methods and the pitfalls to avoid.
You will learn how to optimize a cost function using gradient descent, including popular variants that use randomization and parallelization to improve performance. You will gain an intuition for popular methods used in practice and see how similar they are fundamentally.
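To make the idea concrete, the following is a minimal sketch of gradient descent on a mean-squared-error cost for a one-variable linear model, with an optional randomized (mini-batch) variant. It is an illustration under stated assumptions, not the course's own code: the names `gradient` and `gradient_descent`, the learning rate, the step count, and the noiseless toy data are all invented for this example.

```python
import random

def gradient(w, b, points):
    """Gradient of the mean squared error of y ~ w*x + b over `points`."""
    n = len(points)
    grad_w = sum(2 * (w * x + b - y) * x for x, y in points) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in points) / n
    return grad_w, grad_b

def gradient_descent(points, lr=0.05, steps=2000, batch_size=None, seed=0):
    """Fit w and b by gradient descent.

    batch_size=None uses every point on each step (batch gradient descent);
    a small batch_size samples a random subset per step (mini-batch SGD).
    """
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    for _ in range(steps):
        batch = points if batch_size is None else rng.sample(points, batch_size)
        grad_w, grad_b = gradient(w, b, batch)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Made-up, noiseless data drawn from y = 3x + 1.
data = [(k / 10, 3.0 * (k / 10) + 1.0) for k in range(20)]

w_batch, b_batch = gradient_descent(data)
w_sgd, b_sgd = gradient_descent(data, batch_size=4)
```

Both calls share the same update rule and differ only in which points feed the gradient at each step, which is one sense in which these variants are fundamentally similar.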
A brief tour of selected unsupervised learning methods and an opportunity to apply techniques in practice on a real world problem.
Start your review of Practical Predictive Analytics: Models and Methods
Marcio Gualtieri completed this course, spending 3 hours a week on it, and found the course difficulty to be very easy.
This course is quite outdated. I didn't learn much beyond what I already knew before I started. The Spark courses from edX are way better than these. Hopefully "Big Data Analysis with Scala and Spark" from the "École Polytechnique Fédérale de Lausanne" (also from Coursera) is good (I know their Scala courses, which are taught by Martin Odersky, are quite good).
There are very few quizzes between lectures and the assignments are not very challenging.
Many of the videos, especially the ones at the end, were extremely rushed. They serve more as a review if you already know the subject; otherwise I don't think most people will get much from them.
The audio isn't very good for most of the lectures, many having a very annoying chirping sound (like when you leave an old flip phone near a computer... "teh-teh-teh teh-teh-teh teh-teh-teh teh-tehhhhhh....". Gosh, I haven't heard this sound in maybe over five years...).
The Kaggle competition at the end of the course can be fun if you do the hard work, but you don't need to put in much effort to pass. I know that the submissions I peer reviewed were quite poor, but the grading criteria that we need to follow as reviewers are quite vague and not very thorough. You also run the risk of getting a lower grade than you deserve because your reviewer is incompetent, which is a bummer... At the moment the course has very few people taking it (the same people I peer reviewed also reviewed me, which leads me to believe that maybe only 3 or 4 people were taking this course during the November 2016 iteration).