What you'll learn:
- The algorithm behind recursive partitioning decision trees
- Construct conditional inference decision trees with R's ctree function
- Construct recursive partitioning decision trees with R's rpart function
- Learn to estimate the Gini impurity
- Construct ROC curves and estimate the AUC
- Build Random Forests with R's randomForest package
- Apply Gradient Boosting with R's XGBoost package
- Deal with missing data
Would you like to build predictive models using machine learning? That's precisely what you will learn in this course, “Decision Trees, Random Forests and Gradient Boosting in R.” My name is Carlos Martínez, and I have a Ph.D. in Management from the University of St. Gallen in Switzerland. I have presented my research at some of the most prestigious academic conferences and doctoral colloquiums at the University of Tel Aviv, Politecnico di Milano, University of Halmstad, and MIT. Furthermore, I have co-authored more than 25 teaching cases, some of them included in the case bases of Harvard and Michigan.
This is a comprehensive course that includes presentations, tutorials, and assignments. The course takes a practical, learning-by-doing approach in which you will learn decision trees and ensemble methods based on decision trees using a real dataset. In addition to the videos, you will have access to all the Excel files and R code that we will develop in the videos, and to the solutions to the assignments included in the course, with which you can self-evaluate and gain confidence in your new skills.
After a brief theoretical introduction, we will illustrate step by step the algorithm behind recursive partitioning decision trees. Once we know this algorithm in depth, we will have earned the right to automate it in R, using the ctree and rpart functions to construct conditional inference and recursive partitioning decision trees, respectively. Furthermore, we will learn to estimate the complexity parameter and to prune trees to increase the accuracy and reduce the overfitting of our predictive models. After building the decision trees in R, we will also learn two ensemble methods based on decision trees: Random Forests and Gradient Boosting. Finally, we will construct the ROC curve and calculate the area under that curve, which will serve as a metric to compare the performance of our models.
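To give a flavor of the step-by-step part, here is a minimal base-R sketch of how one candidate split can be scored with the Gini impurity. It is only an illustration, not the course's own code: the function names gini_impurity and split_impurity, and the toy income and default vectors, are made up for this example.

```r
# Gini impurity of a vector of class labels: 1 - sum of squared class shares
gini_impurity <- function(labels) {
  p <- table(labels) / length(labels)
  1 - sum(p^2)
}

# Weighted Gini impurity of the two child nodes produced by a logical split
split_impurity <- function(labels, go_left) {
  n <- length(labels)
  n_left <- sum(go_left)
  n_right <- n - n_left
  (n_left / n) * gini_impurity(labels[go_left]) +
    (n_right / n) * gini_impurity(labels[!go_left])
}

# Toy data, invented purely for illustration
income  <- c(20, 35, 50, 65, 80, 95)
default <- c("yes", "yes", "yes", "no", "no", "no")

parent_gini <- gini_impurity(default)                # impurity before splitting
child_gini  <- split_impurity(default, income < 60)  # impurity after the split
print(parent_gini - child_gini)                      # reduction achieved by the split
```

A recursive partitioning algorithm repeats this comparison over many candidate splits, keeps the one with the largest impurity reduction, and then applies the same procedure to each child node.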
The ideal students for this course are university students and professionals interested in machine learning and business intelligence. The course includes an introduction to the decision tree algorithm, so the only requirement is a basic knowledge of spreadsheets and R.
I hope you are ready to upgrade yourself and learn to build predictive models with decision trees, Random Forests, and Gradient Boosting in R. I'll see you in class!