Online Course
Mathematics for Machine Learning: Multivariate Calculus
Imperial College London via Coursera

Syllabus
Understanding calculus is central to understanding machine learning! You can think of calculus as simply a set of tools for analysing the relationship between functions and their inputs. Often, in machine learning, we are trying to find the inputs which enable a function to best match the data. We start this module from the basics, by recalling what a function is and where we might encounter one. Following this, we talk about how, when sketching a function on a graph, the slope describes the rate of change of the output with respect to an input. Using this visual intuition we next derive a robust mathematical definition of a derivative, which we then use to differentiate some interesting functions. Finally, by studying a few examples, we develop four handy time-saving rules that enable us to speed up differentiation for many common scenarios.
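As a rough illustration of the "slope as rate of change" idea, the sketch below approximates a derivative by rise over run with a small step. The function name `forward_difference`, the example function and the step sizes are assumptions chosen for illustration; they are not taken from the course materials.

```python
def forward_difference(f, x, h):
    """Approximate f'(x) as rise over run across a small step h."""
    return (f(x + h) - f(x)) / h


def cube(x):
    # Example function; its exact derivative is 3 * x**2.
    return x ** 3


exact = 3 * 2.0 ** 2  # exact slope of x**3 at x = 2

# Shrinking the step brings the estimate closer to the exact slope,
# which is the intuition behind defining the derivative as a limit.
for h in (1e-1, 1e-3, 1e-6):
    estimate = forward_difference(cube, 2.0, h)
    print(h, estimate, abs(estimate - exact))
```

In exact arithmetic the estimate keeps improving as the step shrinks; in floating point, very small steps eventually lose accuracy to rounding.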
Multivariate calculus
Building on the foundations of the previous module, we now generalise our calculus tools to handle multivariable systems. This means we can take a function with multiple inputs and determine the influence of each of them separately. It would not be unusual for a machine learning method to require the analysis of a function with thousands of inputs, so we will also introduce the linear algebra structures necessary for storing the results of our multivariate calculus analysis in an orderly fashion.
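To make "the influence of each input separately" concrete, here is a minimal sketch that estimates every partial derivative of a multi-input function numerically and stores the results in a list, a simple stand-in for the row vector often called the Jacobian. The helper name `partial_derivatives` and the example function are hypothetical.

```python
def partial_derivatives(f, point, h=1e-6):
    """Estimate the partial derivative of f with respect to each input.

    Each input is nudged up and down in turn while the others are held
    fixed (a central difference), and the slopes are collected in order.
    """
    slopes = []
    for i in range(len(point)):
        up = list(point)
        down = list(point)
        up[i] += h
        down[i] -= h
        slopes.append((f(up) - f(down)) / (2 * h))
    return slopes


def example(p):
    # f(x, y) = x^2 * y + y, so df/dx = 2xy and df/dy = x^2 + 1.
    x, y = p
    return x ** 2 * y + y


jacobian_row = partial_derivatives(example, [1.0, 2.0])
print(jacobian_row)  # close to [4.0, 2.0], the analytic partials at (1, 2)
```

The same loop works unchanged for a function with thousands of inputs, although at that scale analytic or automatic differentiation is usually preferred over finite differences.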
Multivariate chain rule and its applications
Having seen that multivariate calculus is really no more complicated than the univariate case, we now focus on applications of the chain rule. Neural networks are one of the most popular and successful conceptual structures in machine learning. They are built up from a connected web of neurons and inspired by the structure of biological brains. The behaviour of each neuron is influenced by a set of control parameters, each of which needs to be optimised to best fit the data. The multivariate chain rule can be used to calculate the influence of each parameter of the network, allowing them to be updated during training.
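The chain rule idea can be sketched for a single neuron. The example assumes a tanh activation, a squared-error cost and hand-picked numbers; none of these choices are confirmed by the syllabus, and the function names are hypothetical.

```python
import math


def neuron_cost(w, b, x, y):
    """Squared error of one tanh neuron with weight w and bias b."""
    a = math.tanh(w * x + b)
    return (a - y) ** 2


def neuron_gradients(w, b, x, y):
    """Chain rule: dC/dw = dC/da * da/dz * dz/dw, and likewise for b."""
    z = w * x + b
    a = math.tanh(z)
    dC_da = 2 * (a - y)   # derivative of the cost w.r.t. the activation
    da_dz = 1 - a ** 2    # derivative of tanh(z)
    dz_dw = x             # z = w * x + b
    dz_db = 1.0
    return dC_da * da_dz * dz_dw, dC_da * da_dz * dz_db


w, b, x, y = 0.5, -0.2, 1.5, 0.8
dC_dw, dC_db = neuron_gradients(w, b, x, y)

# Cross-check the chain-rule result against a central difference.
h = 1e-6
numeric_dw = (neuron_cost(w + h, b, x, y) - neuron_cost(w - h, b, x, y)) / (2 * h)

# One small training update in the downhill direction.
new_w, new_b = w - 0.1 * dC_dw, b - 0.1 * dC_db
print(neuron_cost(w, b, x, y), neuron_cost(new_w, new_b, x, y))
```

In a full network the same product of local derivatives is extended backwards through every layer, which is what backpropagation does.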
Taylor series and linearisation
The Taylor series is a method for re-expressing functions as polynomial series. This approach is the rationale behind the use of simple linear approximations to complicated functions. In this module, we will derive the formal expression for the univariate Taylor series and discuss some important consequences of this result relevant to machine learning. Finally, we will discuss the multivariate case and see how the Jacobian and the Hessian come into play.
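As one possible illustration, the sketch below truncates the Taylor series of the exponential function about zero. The choice of function and the name `taylor_exp` are assumptions for the example; order 1 gives the linear approximation 1 + x, and higher orders add further polynomial terms.

```python
import math


def taylor_exp(x, order):
    """Sum the Taylor series of exp about 0 up to the x**order term."""
    total = 0.0
    term = 1.0  # the k = 0 term, x**0 / 0!
    for k in range(order + 1):
        total += term
        term *= x / (k + 1)  # turn x**k / k! into x**(k+1) / (k+1)!
    return total


# Near the expansion point the linearisation is already accurate;
# further away, more terms are needed.
for order in (1, 2, 5, 10):
    print(order, taylor_exp(1.0, order), abs(taylor_exp(1.0, order) - math.e))
```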
Intro to optimisation
If we want to find the minimum and maximum points of a function then we can use multivariate calculus to do this, say to optimise the parameters (the space) of a function to fit some data. First we’ll do this in one dimension and use the gradient to give us estimates of where the zero points of that function are, and then iterate in the Newton-Raphson method. Then we’ll extend the idea to multiple dimensions by finding the gradient vector, Grad, which is the vector of the Jacobian. This will then let us find our way to the minima and maxima in what is called the gradient descent method. We’ll then take a moment to use Grad to find the minima and maxima along a constraint in the space, which is the Lagrange multipliers method.
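The two iterative methods named above can be sketched as follows. The test functions, starting points, step size and iteration counts are assumptions picked so the example converges; they do not come from the course.

```python
def newton_raphson(f, df, x0, steps=20, tol=1e-14):
    """Find a zero of f by repeating x <- x - f(x) / f'(x)."""
    x = x0
    for _ in range(steps):
        step = f(x) / df(x)
        x -= step
        if abs(step) < tol:
            break
    return x


def gradient_descent(grad, start, rate=0.1, steps=200):
    """Walk downhill by repeatedly stepping against the gradient vector."""
    point = list(start)
    for _ in range(steps):
        g = grad(point)
        point = [p - rate * gi for p, gi in zip(point, g)]
    return point


# Newton-Raphson on f(x) = x^2 - 2, whose positive zero is sqrt(2).
root = newton_raphson(lambda x: x * x - 2, lambda x: 2 * x, x0=1.0)


def bowl_gradient(p):
    # Gradient of f(x, y) = (x - 1)^2 + 2 * (y + 2)^2, minimised at (1, -2).
    x, y = p
    return [2 * (x - 1), 4 * (y + 2)]


minimum = gradient_descent(bowl_gradient, [0.0, 0.0])
print(root, minimum)
```

Plain gradient descent with a fixed step only moves downhill, so it locates minima; a maximum can be found the same way by descending on the negated function.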
Regression
In order to optimise the fitting parameters of a fitting function to the best fit for some data, we need a way to define how good our fit is. This goodness of fit is called chi-squared, which we’ll first apply to fitting a straight line (linear regression). Then we’ll look at how to optimise our fitting function using chi-squared in the general case using the gradient descent method. Finally, we’ll look at how to do this easily in Python in just a few lines of code, which will wrap up the course.
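A minimal sketch of a chi-squared straight-line fit is shown below, assuming equal uncertainties on every data point so that chi-squared reduces to a sum of squared residuals. The data values are made up for illustration, and the closed-form least-squares solution used here is a standard textbook result rather than necessarily the code shown in the course.

```python
def chi_squared(m, c, xs, ys, sigma=1.0):
    """Sum of squared residuals of the line y = m * x + c, scaled by sigma."""
    return sum(((y - (m * x + c)) / sigma) ** 2 for x, y in zip(xs, ys))


def fit_line(xs, ys):
    """Closed-form slope and intercept that minimise chi-squared."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    m = s_xy / s_xx
    c = y_mean - m * x_mean
    return m, c


# Illustrative data lying roughly on a straight line.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 7.1, 8.8]

m, c = fit_line(xs, ys)
print(m, c, chi_squared(m, c, xs, ys))
```

For fitting functions without a closed-form solution, the same `chi_squared` value can instead be minimised iteratively, for example by gradient descent on its parameters.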
Taught by
Samuel J. Cooper, David Dye and A. Freddie Page
Related Courses

Mathematics for Machine Learning
Imperial College London
3.0 
Calculus and Optimization for Machine Learning
Higher School of Economics

Machine Learning: Regression
University of Washington
4.6 
Mathematics for Machine Learning: Linear Algebra
Imperial College London
2.9 
Introduction to Machine Learning
Duke University
2.0 
IBM Machine Learning
IBM
Reviews
4.9 rating, based on 9 reviews

Shivani Sharma completed this course, spending 3 hours a week on it and found the course difficulty to be medium.
The course is a great introduction to how one can translate pre-learned mathematical concepts into machine learning. I think it just makes you appreciate complicated mathematical equations as they are tied into neat computational applications.
For those who want an introduction to the math first, the course has plenty of explanatory videos as well. But as someone who did know the math, it just made me realize that my college math can actually be used to do something useful. 
Giuliano Lemes completed this course, spending 3 hours a week on it and found the course difficulty to be hard.
This is the best course I have done so far. The practical part of the course is wonderful: you get to program a neural network using just NumPy, and learn differentiation, Jacobians, Hessians and Newton-Raphson. It is a very difficult course, but it pays off when you finish it.
Anonymous completed this course.
The teaching is clear and concise with an impressive breadth of material covered during the 6 weeks. There is an emphasis on developing intuition, and content is made highly engaging through visual descriptions of calculus techniques. Having completed the course, the understanding you are left with feels profound and rigorous. 
Anonymous completed this course.
Clear explanations, cool animations, informative interactive activities and challenging assignments.
I especially enjoyed understanding backpropagation from first principles and had also never seen multivariate Taylor series before!
Anonymous is taking this course right now.
The part taught by Dr. Samuel J. Cooper is the best course I have ever seen in Calculus. It is very important to illustrate the essence of the topic with the methods used to solve the problems. 
Sajil C K is taking this course right now.
This course is an excellent one. It helped me grasp many complex concepts in an easy way. I hope every explanation/book/video follows this style. I strongly recommend this course to anyone.
Anonymous completed this course.
The concepts taught in the 3 courses are very relevant to Machine Learning. Professors Dye, Cooper and Deisenroth are excellent at teaching and making the material easy to understand. They make the best use of audiovisual technology I have seen in all online classes that I have taken.
I have continued to pursue machine learning education. With the mathematical foundation from these courses, it is much easier to understand how machines learn, and how to improve the performance of existing machine learning frameworks by proper choice of architecture and hyperparameter tuning.
I strongly recommend starting your Machine Learning education by completing this certification. 
Sagar Ladhwani completed this course and found the course difficulty to be medium.
Of the three courses in the Mathematics for ML specialization, this one was the best, since it covered all the fundamentals of machine learning cost-minimization algorithms, and even the assignment notebooks were pretty well designed. I'd recommend this to anyone who wants to understand the math behind ML. For details of the course topics and applications, go through this post, where I've included all the major points:
https://www.linkedin.com/posts/sagarladhwani713b96112_mathematicsdatasciencemachinelearningactivity66449587054084136961lcR 
Benjamin Lau completed this course, spending 5 hours a week on it and found the course difficulty to be medium.
Decent exposure to the topic, and it introduces some common techniques used in data science. The learning curve is steep if you do not have any prior knowledge of calculus.