Week 1: Factor analysis. Quite often a dataset contains so many variables that it cannot be visualized directly, which forces us to rely on formal methods to uncover trends and dependencies in the data. Factor analysis is a widely used machine learning technique for reducing the number of variables in a dataset. We will thoroughly discuss principal component analysis and also consider other factor analysis methods.
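As a small preview, principal components can be computed from the singular value decomposition of the centered data matrix. The sketch below uses NumPy on hypothetical synthetic data (the data, dimensions, and noise level are illustrative assumptions, not course material):

```python
import numpy as np

# Hypothetical toy data: 100 samples of 5 correlated variables
# generated from 2 underlying latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(100, 2))           # two underlying factors
mixing = rng.normal(size=(2, 5))             # loadings onto 5 observed variables
X = latent @ mixing + 0.1 * rng.normal(size=(100, 5))

# PCA: center the data, then take the top singular directions.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
explained = s**2 / np.sum(s**2)              # variance ratio per component
X_reduced = Xc @ Vt[:2].T                    # project onto the first 2 components

print(explained[:2].sum())                   # the first 2 PCs capture most variance
```

Because the data were built from two latent factors, the first two principal components explain almost all of the variance, so the 5-variable dataset can be visualized in 2-D with little loss.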
Week 2: Multiclass logistic regression. Multiclass (or multinomial) logistic regression is a classification method that generalizes logistic regression to the multiclass case, i.e. when there are more than two possible outcomes. It is used when the dependent variable is nominal with more than two categories.
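To make the generalization concrete, here is a minimal from-scratch sketch of multinomial (softmax) regression trained by gradient descent on hypothetical 3-class toy data; the class centers, learning rate, and iteration count are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
# Three well-separated Gaussian blobs in 2-D (hypothetical toy data).
centers = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(50, 2)) for c in centers])
y = np.repeat([0, 1, 2], 50)

# Softmax (multinomial logistic) regression via gradient descent.
Xb = np.hstack([X, np.ones((len(X), 1))])    # add an intercept column
W = np.zeros((3, 3))                         # one weight row per class
Y = np.eye(3)[y]                             # one-hot targets
for _ in range(500):
    logits = Xb @ W.T
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    P = np.exp(logits)
    P /= P.sum(axis=1, keepdims=True)             # class probabilities
    W -= 0.1 * (P - Y).T @ Xb / len(X)            # cross-entropy gradient step

pred = np.argmax(Xb @ W.T, axis=1)
print((pred == y).mean())                    # training accuracy
```

With more than two categories, the single sigmoid of binary logistic regression is replaced by a softmax over one linear score per class, which is exactly the step this sketch illustrates.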
Week 3: Resampling and decision trees. Resampling methods are essential for testing and evaluating statistical models: for instance, you can repeatedly draw samples from your data and assess the variability and stability of your model across them. Decision trees are an intuitive framework for making decisions and are also widely used for regression and classification. The predictor space is split into a number of regions, and the prediction for an observation is the mean or mode of the training observations in the region it falls into.
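Both ideas fit in a short sketch: a bootstrap estimate of a statistic's variability, followed by a depth-1 regression tree (a "stump") that splits the predictor space at one threshold and predicts the region mean. The data and thresholds below are hypothetical illustrations:

```python
import numpy as np

rng = np.random.default_rng(2)

# --- Resampling: bootstrap standard error of the sample mean ---
data = rng.exponential(scale=2.0, size=200)     # skewed sample, true mean = 2
boot_means = np.array([
    rng.choice(data, size=len(data), replace=True).mean()
    for _ in range(2000)                        # resample with replacement
])
se = boot_means.std()                           # bootstrap standard error
print(round(se, 3))

# --- Decision stump: one split, predict the mean within each region ---
x = rng.uniform(0, 10, size=200)
y_noisy = np.where(x < 5, 1.0, 3.0) + rng.normal(scale=0.3, size=200)

# Choose the threshold minimizing the total within-region variance.
best = min(
    (np.var(y_noisy[x <= t]) * (x <= t).sum()
     + np.var(y_noisy[x > t]) * (x > t).sum(), t)
    for t in np.linspace(2, 8, 61)
)[1]
print(best)                                     # split threshold near 5
```

The bootstrap standard error should land near the theoretical value 2/sqrt(200) ≈ 0.14, and the stump recovers the true change point in the data, illustrating prediction by region means.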
Week 4: Support vector machines. The SVM is a supervised learning model used for classification and regression analysis. We will first thoroughly consider a simpler and more intuitive classifier, the optimal margin classifier, and then proceed to the generalized SVM.
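A linear SVM can be trained with stochastic subgradient descent on the hinge loss. The sketch below is a Pegasos-style illustration on hypothetical separable toy data (the regularization constant, step schedule, and class centers are assumptions; the bias term is omitted since the toy classes are separable through the origin):

```python
import numpy as np

rng = np.random.default_rng(3)
# Two linearly separable classes in 2-D (hypothetical toy data).
X = np.vstack([rng.normal(loc=-2, size=(50, 2)),
               rng.normal(loc=2, size=(50, 2))])
y = np.repeat([-1, 1], 50)

# Linear SVM via stochastic subgradient descent on the hinge loss.
# lam trades margin width against training error.
w = np.zeros(2)
lam = 0.01
for t in range(1, 2001):
    eta = 1.0 / (lam * t)                  # decaying step size
    i = rng.integers(len(X))               # pick one random training point
    if y[i] * (X[i] @ w) < 1:              # point inside the margin
        w = (1 - eta * lam) * w + eta * y[i] * X[i]
    else:                                  # outside the margin: only shrink w
        w = (1 - eta * lam) * w

pred = np.sign(X @ w)
print((pred == y).mean())                  # training accuracy
```

The update only "pulls" on points that violate the margin, which is the intuition behind support vectors: the separating hyperplane is determined by the few points closest to it.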
Week 5: Reinforcement learning. We discuss the main principles of reinforcement learning: an agent interacts with an environment, receives positive or negative feedback from the environment for its actions, and learns to maximize the cumulative reward. The Q-learning method will be considered in detail.
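Tabular Q-learning fits in a few lines. Below is a sketch on a hypothetical toy environment, a deterministic five-state corridor where only reaching the rightmost state is rewarded; the learning rate, discount factor, and exploration rate are illustrative assumptions:

```python
import random

# Toy corridor: states 0..4, action 0 = left, 1 = right,
# reward +1 for entering the goal state 4 (hypothetical environment).
N, GOAL = 5, 4
alpha, gamma, eps = 0.5, 0.9, 0.1          # learning rate, discount, exploration
Q = [[0.0, 0.0] for _ in range(N)]
random.seed(0)

def greedy(s):
    # break ties randomly so the untrained agent still explores
    m = max(Q[s])
    return random.choice([a for a, q in enumerate(Q[s]) if q == m])

for episode in range(300):
    s = 0
    for _ in range(100):                   # cap episode length
        a = random.randrange(2) if random.random() < eps else greedy(s)
        s2 = max(0, s - 1) if a == 0 else min(N - 1, s + 1)
        r = 1.0 if s2 == GOAL else 0.0
        # Q-learning update: nudge Q(s, a) toward r + gamma * max_a' Q(s', a')
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        s = s2
        if s == GOAL:
            break

policy = [greedy(s) for s in range(GOAL)]
print(policy)                              # learned policy per non-goal state
```

Because the reward lies only at the right end, the bootstrapped targets propagate it backward through the table, and the greedy policy learns to move right in every state, maximizing the discounted cumulative reward.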