Data science has become ubiquitous. More and more companies are looking to harness the vast amounts of data they hold to improve efficiency, reduce costs, and increase sales. As a chemical engineer at a large fertilizer company, I too work with vast amounts of data coming from the various instruments of our plants. A careful analysis of these data can help us optimize operations, improve process safety, and increase production.
Why I took this course
Traditionally, I used Microsoft Excel for these analyses. However, as the data size increased, handling it in Excel became a problem. In the last quarter of 2020, I came across Predictive Modeling and Machine Learning with MATLAB by MathWorks, the makers of MATLAB, by chance and decided to explore what it could offer. The course is very well prepared and structured. It gives an excellent introduction to various data types and how to handle them. It is also part of the Practical Data Science with MATLAB Specialization.
The course offered by MathWorks provides an excellent entry into the vast and ever-growing field of data science. It starts with how to visualize and clean data for better analysis before introducing unsupervised learning. The principle behind principal component analysis (PCA) is then explained in the simplest way; readers interested in the underlying math can find it in other resources. Next comes supervised learning. The machine learning apps are among the best of what MATLAB offers: they let you create and compare different models very quickly, and they also generate the underlying code, so you can later build the model from code alone. Only with the raw code in hand can you make whatever adjustments or modifications you want.
The first part of the course is Exploratory Data Analysis. It describes how to import data into MATLAB’s workspace for further processing and analysis. Raw data is messy, so the first step is to clean it until it is suitable for meaningful study. This part walks through MATLAB’s built-in functions for checking and then removing or replacing missing values, outliers, and other inconsistent data. The graphical visualization techniques explained in this module demonstrate MATLAB’s powerful capabilities for visualizing data. The module also covers deducing patterns, examining trends, and drawing conclusions using various mathematical and statistical measures.
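To make the cleaning step concrete, here is a minimal sketch of filling missing values and flagging outliers. It is written in plain Python rather than MATLAB and is not taken from the course; MATLAB has built-in functions such as fillmissing and isoutlier for this kind of work. The function names, the median-based fill, the three-scaled-MAD rule, and the temperature readings are all my own assumptions for illustration.

```python
import statistics

def fill_missing_with_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]

def flag_outliers(values, threshold=3.0):
    """Flag values more than `threshold` scaled MADs away from the median."""
    median = statistics.median(values)
    mad = statistics.median([abs(v - median) for v in values])
    # 1.4826 scales the MAD so it is comparable to a standard deviation
    # for normally distributed data.
    scaled_mad = 1.4826 * mad
    if scaled_mad == 0:
        return [False] * len(values)
    return [abs(v - median) > threshold * scaled_mad for v in values]

# Hypothetical temperature readings (made-up numbers); None marks a missing sample.
readings = [101.2, 100.8, None, 101.5, 250.0, 100.9, 101.1]

filled = fill_missing_with_median(readings)
outliers = flag_outliers(filled)
cleaned = [v for v, is_bad in zip(filled, outliers) if not is_bad]
print(cleaned)
```

In this made-up series only the 250.0 reading is flagged and dropped. Whether to remove an outlier or replace it depends on the process being studied; a spike in plant data can be a faulty sensor or a real upset.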
The second part covers Feature Engineering and Unsupervised Learning. It shows how new variables that better describe the system or process can be created from the existing ones. Unsupervised learning involves grouping the data based on the relationships among the variables, without using labeled outcomes. The instructors also explain dimensionality reduction techniques such as principal component analysis (PCA) for data sets with many variables.
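The idea behind PCA can be sketched for the simplest case of two correlated variables. This is again plain Python, not the course's MATLAB code: the function name pca_2d and the sample points are hypothetical, and the closed-form eigenvalue formula used here only works for a 2x2 covariance matrix.

```python
import math

def pca_2d(points):
    """Return (explained-variance ratios, first principal direction) for 2-D data."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    # Sample covariance matrix [[sxx, sxy], [sxy, syy]] of the centered data.
    sxx = sum((p[0] - mean_x) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - mean_y) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points) / (n - 1)
    # Eigenvalues of a symmetric 2x2 matrix in closed form.
    half_trace = (sxx + syy) / 2
    spread = math.hypot((sxx - syy) / 2, sxy)
    lam1, lam2 = half_trace + spread, half_trace - spread
    # Angle of the axis along which the data vary the most.
    angle = 0.5 * math.atan2(2 * sxy, sxx - syy)
    direction = (math.cos(angle), math.sin(angle))
    total = lam1 + lam2
    return (lam1 / total, lam2 / total), direction

# Hypothetical measurements of two variables that move together (roughly y = 2x).
points = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8), (5.0, 10.1)]

ratios, direction = pca_2d(points)
print(ratios, direction)
```

For these points the first component carries almost all of the variance, so the two variables could be replaced by a single one with little loss of information. That is the sense in which PCA reduces dimensionality.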
The third part of the course covers the primary intention of the study: Machine Learning. The chapters cover how the major machine learning algorithms work and how to run them in MATLAB. The instructors clearly explain the purpose of techniques such as regression, classification, support vector machines, and bagged ensembles. New York City taxi data is used as the example throughout this module.
The material and discussion forum
While the course itself is excellent, a set of course notes or a write-up covering the details of the subject taught would give students a quick reference. The presentation slides are good enough to be included in such course material.
The course is condensed, so be prepared to code along with the lectures. You may want to watch them more than once to really get an intuition for what is happening. The discussion forums also provide excellent information, and queries raised there are usually answered promptly by the teaching staff.
The course clearly demonstrates the ‘muscle power’ MATLAB has in dealing with data. Backed by its already vast array of built-in functions, you can wrangle and manipulate the data in any way you want.
Based on the above, I highly recommend this course to students looking to get into data science, particularly industrial data science, as MATLAB already has many other functions that help with scientific computation.
To the MathWorks teaching staff: a BIG THANK YOU from my side. I look forward to more such courses. I would also request that you provide students with a PDF of the course contents covering the major aspects taught in the course.