How can data-intensive research methods be used to create more equitable and effective learning environments? In this course, you will learn how data from digital learning environments and administrative data systems can be used to better understand those environments, identify students in need of support, and assess the effects of changes made to them.
This course pays particular attention to the ways in which researchers and data scientists can transform raw data into features (i.e., variables or predictors) used in various machine learning algorithms. We will provide strategies for using prior research, knowledge from practice, and logic to create features, as well as for building and evaluating machine learning models. The process of building features will be discussed within a broader data-intensive research workflow using R.
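As a rough illustration of what transforming raw data into features can look like, the sketch below aggregates an event log into one row per student using base R. The data frame `events`, its columns, and the feature names (`n_events`, `total_minutes`, `n_submits`, `submit_rate`) are hypothetical and invented for this example; they are not taken from the course materials.

```r
# Hypothetical raw event log: one row per student action (toy data for illustration only).
events <- data.frame(
  student_id = c("s1", "s1", "s1", "s2", "s2", "s3"),
  action     = c("view", "submit", "view", "view", "view", "submit"),
  minutes    = c(5, 12, 3, 8, 2, 20),
  stringsAsFactors = FALSE
)

# Feature 1: number of logged events per student.
n_events <- aggregate(minutes ~ student_id, data = events, FUN = length)
names(n_events)[2] <- "n_events"

# Feature 2: total time on task per student.
total_minutes <- aggregate(minutes ~ student_id, data = events, FUN = sum)
names(total_minutes)[2] <- "total_minutes"

# Feature 3: number of submissions per student.
events$is_submit <- events$action == "submit"
n_submits <- aggregate(is_submit ~ student_id, data = events, FUN = sum)
names(n_submits)[2] <- "n_submits"

# Combine the per-student summaries into a single feature table.
features <- Reduce(
  function(x, y) merge(x, y, by = "student_id"),
  list(n_events, total_minutes, n_submits)
)

# Derived feature: share of a student's events that were submissions.
features$submit_rate <- features$n_submits / features$n_events

print(features)
```

Which summaries are worth computing is the substantive question: a ratio such as `submit_rate` is only useful if prior research or knowledge from practice suggests it relates to the outcome of interest.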
Week 1: Finding features
Introduction to setting up a feature engineering workflow, which includes identifying problems of practice, reviewing relevant research, and brainstorming potential features.
Week 2: Data wrangling and visualization
Introduction to data wrangling, data visualization techniques, and structure discovery algorithms. Integrating theory, knowledge from practice, logic, and contextual factors into feature engineering will also be discussed.
Week 3: Modeling features
Introduction to using features within selected machine learning algorithms (e.g., logistic regression and decision trees) and the tradeoffs between interpretability and predictive performance.
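To make the modeling step concrete, here is a minimal sketch of fitting a logistic regression to engineered features in base R. The data are simulated, and the names `students`, `total_minutes`, `submit_rate`, and `passed`, along with the coefficients used to generate the outcome, are assumptions made for this example rather than anything from the course. A decision tree is not shown.

```r
# Simulated student-level features (illustrative only; not real data).
set.seed(1)
n <- 200
students <- data.frame(
  total_minutes = runif(n, min = 0, max = 120),
  submit_rate   = runif(n, min = 0, max = 1)
)

# Simulated binary outcome whose log-odds depend on both features.
p_pass <- plogis(-2 + 0.02 * students$total_minutes + 2 * students$submit_rate)
students$passed <- rbinom(n, size = 1, prob = p_pass)

# Hold out the last 50 rows to evaluate predictions on unseen students.
train <- students[1:150, ]
test  <- students[151:200, ]

# Logistic regression: each coefficient is a change in the log-odds of passing.
fit <- glm(passed ~ total_minutes + submit_rate, data = train, family = binomial)

# Interpretability: exponentiated coefficients can be read as odds ratios.
odds_ratios <- exp(coef(fit))
print(odds_ratios)

# Prediction: classify held-out students at a 0.5 probability threshold.
pred_prob  <- predict(fit, newdata = test, type = "response")
pred_class <- as.integer(pred_prob >= 0.5)
accuracy   <- mean(pred_class == test$passed)
print(accuracy)
```

The odds ratios are what make this kind of model comparatively easy to explain to practitioners, while the held-out accuracy is one simple way to compare it against a more flexible but less transparent alternative.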