Welcome to Introduction to Predictive Modeling, the first course in the University of Minnesota’s Analytics for Decision Making specialization.
This course introduces you to the concepts, processes, and applications of predictive modeling, with a focus on linear regression and time series forecasting models and their practical use in Microsoft Excel. By the end of the course, you will be able to:
- Understand the concepts, processes, and applications of predictive modeling.
- Understand the structure of and intuition behind linear regression models.
- Fit simple and multiple linear regression models to data, interpret the results, evaluate the goodness of fit, and use fitted models to make predictions.
- Understand the problems of overfitting and underfitting and conduct simple model selection.
- Understand the concepts, processes, and applications of time series forecasting as a special type of predictive modeling.
- Fit several time series forecasting models (e.g., exponential smoothing and Holt-Winters’ method) in Excel, evaluate the goodness of fit, and use fitted models to make forecasts.
- Understand different types of data and how they may be used in predictive models.
- Use Excel to prepare data for predictive modeling, including exploring data patterns, transforming data, and dealing with missing values.
This is an introductory course in predictive modeling that combines conceptual and hands-on learning. Throughout the course, you will have opportunities to practice predictive modeling techniques on real-world datasets using Excel.
To succeed in this course, you should know basic math (the concept of functions, variables, and basic mathematical notation such as summation and indices) and basic statistics (correlation, sample mean, standard deviation, and variance). This course does not require a background in programming, but you should be familiar with basic Excel operations (e.g., basic formulas and charting). For the best experience, you should have a recent version of Microsoft Excel installed on your computer (e.g., Excel 2013, 2016, 2019, or Office 365).
Week/Module 1: Simple Linear Regression
This module provides a brief overview of predictive modeling problems, illustrating their broad applications. It then focuses on the simplest form of predictive model: simple linear regression. The module follows a graphical approach to illustrate the structure of a simple linear regression model, the intuition behind ordinary least squares (OLS), and related concepts. Finally, we demonstrate how to use various Excel tools, including trendlines, the Regression tool, and the TREND() function, to fit a simple linear regression model and use it to form predictions.
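For readers who want to see the arithmetic behind an OLS fit outside of Excel, here is a minimal Python sketch. It is not part of the course materials; the function names and the tiny `ad_spend`/`sales` dataset are hypothetical and chosen only for illustration.

```python
def fit_simple_ols(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Slope = sum of cross-deviations divided by sum of squared x-deviations.
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    # The fitted line always passes through the point of means.
    intercept = y_mean - slope * x_mean
    return intercept, slope


def predict_simple(intercept, slope, x):
    """Predict y for a new x using the fitted line."""
    return intercept + slope * x


# Hypothetical data that lies exactly on y = 1 + 2x.
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [3.0, 5.0, 7.0, 9.0, 11.0]

intercept, slope = fit_simple_ols(ad_spend, sales)
next_sales = predict_simple(intercept, slope, 6.0)
```

The intercept and slope computed this way should agree with the values Excel reports for the same data through a linear trendline or the Regression tool.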
Week/Module 2: Multiple Linear Regression
Building on Week 1, this module introduces multiple linear regression and its broad applications. We then cover how to fit a multiple linear regression model using Excel’s Regression tool and TREND() function and how to use the resulting model for predictions. The module further discusses the problems of overfitting and underfitting and the basic principles of a good regression model. It also introduces one approach to selecting a good model, backward elimination, which can be implemented in Excel.
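As a rough illustration of what fitting a multiple linear regression involves, the sketch below solves the least-squares normal equations in plain Python. It is an assumption-laden teaching example, not the course's method: the helper names and the two-predictor dataset are hypothetical, and Excel's Regression tool additionally reports statistics (such as p-values) that this sketch does not compute.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are not modified.
    m = [row[:] + [b_i] for row, b_i in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_multiple_ols(rows, ys):
    """Return [intercept, b1, b2, ...] minimizing the sum of squared errors."""
    design = [[1.0] + list(r) for r in rows]  # leading 1.0 models the intercept
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict_multiple(coefs, row):
    """Predict y for one row of predictor values."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], row))


# Hypothetical data generated exactly from y = 1 + 2*x1 + 3*x2.
predictors = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0), (5.0, 6.0)]
response = [9.0, 8.0, 19.0, 18.0, 29.0]

coefs = fit_multiple_ols(predictors, response)
new_prediction = predict_multiple(coefs, (6.0, 5.0))
```

Backward elimination would repeat a fit like this one, dropping the least useful predictor at each step; the criterion for "least useful" is left to the course itself.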
Week/Module 3: Data Preparation
This week, we learn how to prepare a dataset for predictive modeling and introduce Excel tools that support this work. We discuss different types of variables and how categorical, string, and datetime values may be used in predictive modeling. We also discuss the intuition for including higher-order and interaction variables in regression models, the issue of multicollinearity, and how to handle missing values. Finally, we introduce several handy Excel tools for data handling and exploration, including PivotTables, the IF() and VLOOKUP() functions, and relative references.
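Two of these preparation steps can be sketched in a few lines of Python. The example below assumes mean imputation for missing values and dummy (0/1) coding for a categorical variable; both are common choices but are assumptions here, not necessarily the techniques the course teaches. The function names and sample values are hypothetical.

```python
def mean_impute(values):
    """Replace missing entries (None) with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]


def dummy_encode(categories):
    """Turn a categorical column into 0/1 columns, dropping one baseline level.

    Dropping a baseline avoids perfect multicollinearity with the intercept.
    Returns (encoded_rows, column_levels).
    """
    levels = sorted(set(categories))
    others = levels[1:]  # levels[0] serves as the baseline
    rows = [[1 if c == level else 0 for level in others] for c in categories]
    return rows, others


# Hypothetical columns from a small sales dataset.
prices = mean_impute([10.0, None, 20.0])
region_rows, region_columns = dummy_encode(["North", "South", "East", "South"])
```

In Excel, a comparable dummy column can be built with the IF() function, for example by returning 1 when a cell equals a given region and 0 otherwise.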
Week/Module 4: Time Series Forecasting
This module focuses on a special subset of predictive modeling: time series forecasting. We discuss the nature of time-series data and the structure of time series forecasting problems. We then introduce a host of time series models for stationary data and data with trends and seasonality, with a focus on techniques that are easily implemented within Excel, including moving average, exponential smoothing, double moving average, Holt’s method, and Holt-Winters’ method. The module also covers linear-regression-based forecasting and a composite forecasting technique for boosting accuracy.
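To make one of these techniques concrete, here is a minimal Python sketch of simple exponential smoothing. It assumes the common convention of initializing the first forecast to the first observation; the course may initialize differently. The function name, the `demand` series, and the smoothing constant are hypothetical.

```python
def exponential_smoothing_forecasts(series, alpha):
    """Return one-step-ahead forecasts F_1..F_{n+1} for a series y_1..y_n.

    Uses F_{t+1} = alpha * y_t + (1 - alpha) * F_t, with F_1 = y_1 (assumed).
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    forecasts = [series[0]]  # assumed initialization: F_1 = y_1
    for observed in series:
        forecasts.append(alpha * observed + (1 - alpha) * forecasts[-1])
    return forecasts


# Hypothetical demand history; the last list element forecasts the next period.
demand = [10.0, 12.0, 14.0]
forecasts = exponential_smoothing_forecasts(demand, alpha=0.5)
next_period_forecast = forecasts[-1]
```

A larger `alpha` makes the forecast react faster to recent observations, while a smaller one smooths out more noise. The same recurrence can be written as a single Excel formula filled down a column.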