Statistical experiment design and analytics are at the heart of data science. In this course you will design statistical experiments and analyze the results using modern methods. You will also explore the common pitfalls in interpreting statistical arguments, especially those associated with big data. Collectively, this course will help you internalize a core set of practical and effective machine learning methods and concepts, and apply them to solve some real world problems.
Learning Goals: After completing this course, you will be able to:
1. Design effective experiments and analyze the results
2. Use resampling methods to make clear and bulletproof statistical arguments without invoking esoteric notation
3. Explain and apply a core set of classification methods of increasing complexity (rules, trees, random forests), and associated optimization methods (gradient descent and variants)
4. Explain and apply a set of unsupervised learning concepts and methods
5. Describe the common idioms of large-scale graph analytics, including structural query, traversals and recursive queries, PageRank, and community detection
Practical Statistical Inference
-Learn the basics of statistical inference, comparing classical methods with resampling methods that allow you to use a simple program to make a rigorous statistical argument. Motivate your study with current topics at the foundations of science: publication bias and reproducibility.
-Follow a tour through the important methods, algorithms, and techniques in machine learning. You will learn how these methods build upon each other and can be combined into practical algorithms that perform well on a variety of tasks. Learn how to evaluate machine learning methods and the pitfalls to avoid.
-You will learn how to optimize a cost function using gradient descent, including popular variants that use randomization and parallelization to improve performance. You will gain an intuition for popular methods used in practice and see how similar they are fundamentally.
-A brief tour of selected unsupervised learning methods and an opportunity to apply techniques in practice on a real world problem.