In this course, students will explore supervised machine learning techniques using the Python scikit-learn (sklearn) toolkit and real-world athletic data to understand both machine learning algorithms and how to predict athletic outcomes. Building on the previous courses in the specialization, students will apply methods such as support vector machines (SVMs), decision trees, random forests, linear and logistic regression, and ensembles of learners to examine data from professional sports leagues such as the NHL and MLB, as well as from wearable devices such as the Apple Watch and inertial measurement units (IMUs). By the end of the course, students will have a broad understanding of how classification and regression techniques can be used to enable sports analytics across athletic activities and events.
Machine Learning Concepts
This week introduces the concept of machine learning and describes the four major areas in which it can be applied in sports analytics. The machine learning pipeline will be discussed, along with some common issues that arise when using machine learning for sports analytics.
Support Vector Machines
This week, students will learn how support vector machines (SVMs) work and will apply these models to both baseball and wearable data. By the end of the week, students will have experience building SVMs with real data and will be able to apply them to problems of their own.
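As a rough illustration of the kind of model built this week, the following minimal sketch fits an SVM classifier with scikit-learn. The data are synthetic (generated with make_classification) and only stand in for baseball or wearable measurements; the number of features, the RBF kernel, and the value of C are illustrative assumptions, not the course's own settings.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for real athletic data
# (assumption: 4 numeric features, 2 outcome classes).
X, y = make_classification(
    n_samples=300, n_features=4, n_informative=3, n_redundant=1, random_state=0
)

# Hold out a quarter of the rows to estimate performance on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# SVMs are sensitive to feature scale, so standardize before fitting.
svm_model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
svm_model.fit(X_train, y_train)

svm_accuracy = svm_model.score(X_test, y_test)
print(f"Held-out accuracy: {svm_accuracy:.2f}")
```

Placing the scaler and the classifier in one pipeline means the scaling statistics are learned from the training rows only, which avoids leaking information from the held-out rows.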
Decision Trees
This week focuses on interpretable methods for machine learning, with particular emphasis on decision trees. Students will learn how these models work in general and see special uses of decision trees in combination with regression methods. Students will also come to better understand how the Python sklearn toolkit can be used for a breadth of supervised learning tasks.
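To give a sense of why decision trees are considered interpretable, the sketch below fits a shallow tree with scikit-learn and prints its learned rules as plain text. The data are synthetic, and the feature names (speed, distance, heart_rate) are hypothetical labels chosen only for illustration; they do not come from the course datasets.

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier, export_text

# Synthetic stand-in data (assumption: 3 numeric features, 2 classes).
X, y = make_classification(
    n_samples=200, n_features=3, n_informative=2, n_redundant=1, random_state=0
)

# Hypothetical feature names, used only to make the printed rules readable.
feature_names = ["speed", "distance", "heart_rate"]

# A depth limit keeps the tree small enough to read in full.
tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(X, y)

# export_text renders the fitted tree as nested if/else threshold rules.
rules = export_text(tree, feature_names=feature_names)
print(rules)
```

Each printed branch is a threshold test on a single feature, so a prediction can be traced by hand from the root to a leaf.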
Ensembles & Beyond
This week, students will learn how many different models can be used together through ensembles, including the random forest method as a common example, as well as more general methods available in sklearn such as stacking and bagging. By the end of this week, students will have a broad understanding of how methods such as SVMs, decision trees, and logistic regression can be combined to improve predictive performance on a problem.
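The two general ensemble methods named above can be sketched with scikit-learn as follows. This is a minimal example on synthetic data, not the course's own code: the choice of base models, the number of bagged trees, and the use of logistic regression as the final combiner are all assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption: 6 numeric features, 2 classes).
X, y = make_classification(
    n_samples=400, n_features=6, n_informative=4, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Bagging: many decision trees, each fit on a bootstrap sample of the
# training rows, with predictions combined by voting.
bagged_trees = BaggingClassifier(
    DecisionTreeClassifier(), n_estimators=25, random_state=0
)

# Stacking: an SVM and a decision tree make predictions, and a logistic
# regression learns how to combine those predictions.
stack = StackingClassifier(
    estimators=[
        ("svm", make_pipeline(StandardScaler(), SVC())),
        ("tree", DecisionTreeClassifier(max_depth=3, random_state=0)),
    ],
    final_estimator=LogisticRegression(),
)

scores = {}
for name, model in [("bagging", bagged_trees), ("stacking", stack)]:
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)
    print(f"{name}: held-out accuracy {scores[name]:.2f}")
```

An ensemble is not guaranteed to outperform its best single member, so it is worth comparing each ensemble against its base models on held-out data.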