Online Course

Prediction and Control with Function Approximation

University of Alberta and Alberta Machine Intelligence Institute via Coursera

Overview

In this course, you will learn how to solve problems with large, high-dimensional, and potentially infinite state spaces. You will see that estimating value functions can be cast as a supervised learning problem (function approximation), allowing you to build agents that carefully balance generalization and discrimination in order to maximize reward. We will begin this journey by investigating how our policy evaluation or prediction methods like Monte Carlo and TD can be extended to the function approximation setting. You will learn about feature construction techniques for RL, and representation learning via neural networks and backprop. We conclude this course with a deep dive into policy gradient methods, a way to learn policies directly without learning a value function. In this course you will solve two continuous-state control tasks and investigate the benefits of policy gradient methods in a continuous-action environment.

Prerequisites: This course strongly builds on the fundamentals of Courses 1 and 2, and learners should have completed these before starting this course. Learners should also be comfortable with probabilities & expectations, basic linear algebra, basic calculus, Python 3.0 (at least 1 year), and implementing algorithms from pseudocode.

By the end of this course, you will be able to:

-Understand how to use supervised learning approaches to approximate value functions
-Understand objectives for prediction (value estimation) under function approximation
-Implement TD with function approximation (state aggregation), on an environment with an infinite state space (continuous state space)
-Understand fixed basis and neural network approaches to feature construction
-Implement TD with neural network function approximation in a continuous state environment
-Understand new difficulties in exploration when moving to function approximation
-Contrast discounted problem formulations for control versus an average reward problem formulation
-Implement expected Sarsa and Q-learning with function approximation on a continuous state control task
-Understand objectives for directly estimating policies (policy gradient objectives)
-Implement a policy gradient method (called Actor-Critic) on a discrete state environment

Syllabus

Welcome to the Course!
-Welcome to the third course in the Reinforcement Learning Specialization: Prediction and Control with Function Approximation, brought to you by the University of Alberta, Onlea, and Coursera. In this pre-course module, you'll be introduced to your instructors, and get a flavour of what the course has in store for you. Make sure to introduce yourself to your classmates in the "Meet and Greet" section!

On-policy Prediction with Approximation
-This week you will learn how to estimate a value function for a given policy, when the number of states is much larger than the memory available to the agent. You will learn how to specify a parametric form of the value function, how to specify an objective function, and how stochastic gradient descent can be used to estimate values from interaction with the world.
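
As a rough illustration of the idea in this module, the sketch below applies one semi-gradient TD(0) update to a linear value function over state-aggregation features. It is not taken from the course materials: the function names, the 10-state/5-group layout, and the transition numbers are all hypothetical.

```python
def linear_value(w, x):
    """Approximate value v(s, w) = w . x(s) for a feature vector x."""
    return sum(wi * xi for wi, xi in zip(w, x))


def semi_gradient_td0_update(w, x, reward, x_next, alpha=0.1, gamma=1.0, terminal=False):
    """Return new weights after one semi-gradient TD(0) step.

    For a linear approximator the gradient of v(s, w) with respect to w is x(s).
    """
    target = reward if terminal else reward + gamma * linear_value(w, x_next)
    td_error = target - linear_value(w, x)
    return [wi + alpha * td_error * xi for wi, xi in zip(w, x)]


def aggregate(state, n_states=10, n_groups=5):
    """Hypothetical state aggregation: one-hot feature for the state's group."""
    group = state * n_groups // n_states
    return [1.0 if g == group else 0.0 for g in range(n_groups)]


w = [0.0] * 5
# One illustrative transition: state 3 -> state 4 with reward 1.
w = semi_gradient_td0_update(w, aggregate(3), 1.0, aggregate(4), alpha=0.5)
```

Only the weight for the group containing state 3 changes, which is how aggregation generalizes across the states sharing that group.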

Constructing Features for Prediction
-The features used to construct the agent’s value estimates are perhaps the most crucial part of a successful learning system. In this module we discuss two basic strategies for constructing features: (1) a fixed basis that forms an exhaustive partition of the input, and (2) adapting the features while the agent interacts with the world via Neural Networks and Backpropagation. In this week’s graded assessment you will solve a simple but infinite state prediction task with a Neural Network and TD learning.
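
One common fixed-basis construction is tile coding, where several offset partitions of the input each contribute one active feature. The sketch below is a minimal one-dimensional version written for illustration, not the course's implementation. The function name, the offsets, and the choice of one extra tile per tiling are assumptions.

```python
def tile_features(x, low=0.0, high=1.0, n_tilings=4, n_tiles=4):
    """Return the index of the single active tile in each tiling for scalar x."""
    tile_width = (high - low) / n_tiles
    active = []
    for t in range(n_tilings):
        # Each tiling is shifted by a fraction of a tile width (assumed scheme).
        offset = t * tile_width / n_tilings
        idx = int((x - low + offset) / tile_width)
        # One extra tile per tiling covers the shifted upper edge.
        idx = min(idx, n_tiles)
        active.append(t * (n_tiles + 1) + idx)
    return active


features_a = tile_features(0.20)
features_b = tile_features(0.22)
```

Nearby inputs such as 0.20 and 0.22 activate the same tiles, so a weight update for one also changes the estimate for the other.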

Control with Approximation
-This week, you will see that the concepts and tools introduced in modules two and three allow straightforward extension of classic TD control methods to the function approximation setting. In particular, you will learn how to find the optimal policy in infinite-state MDPs by simply combining semi-gradient TD methods with generalized policy iteration, yielding classic control methods like Q-learning and Sarsa. We conclude with a discussion of a new problem formulation for RL, average reward, which will undoubtedly be used in many applications of RL in the future.
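
The control methods named here differ mainly in the target they bootstrap from. The sketch below compares the Q-learning target with the expected Sarsa target under an epsilon-greedy policy. It is an illustration only: the function names are hypothetical, and ties between greedy actions are broken by taking the first one, which is an assumption.

```python
def q_learning_target(reward, next_q, gamma):
    """Bootstrap from the greedy (maximum) next action value."""
    return reward + gamma * max(next_q)


def expected_sarsa_target(reward, next_q, gamma, epsilon):
    """Bootstrap from the expected next action value under epsilon-greedy."""
    n = len(next_q)
    greedy = max(range(n), key=lambda a: next_q[a])  # first maximizer on ties
    probs = [epsilon / n + (1.0 - epsilon if a == greedy else 0.0) for a in range(n)]
    return reward + gamma * sum(p * q for p, q in zip(probs, next_q))


next_q = [1.0, 3.0]  # illustrative action values in the next state
q_target = q_learning_target(0.0, next_q, gamma=0.5)
es_target = expected_sarsa_target(0.0, next_q, gamma=0.5, epsilon=0.5)
```

With epsilon set to 0 the two targets coincide, since the epsilon-greedy policy then puts all probability on the greedy action.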

Policy Gradient
-Every algorithm you have learned about so far estimates a value function as an intermediate step towards the goal of finding an optimal policy. An alternative strategy is to directly learn the parameters of the policy. This week you will learn about these policy gradient methods, and their advantages over value-function based methods. You will also learn how policy gradient methods can be used to find the optimal policy in tasks with both continuous state and action spaces.
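
To make "directly learn the parameters of the policy" concrete, the sketch below performs one actor update for a softmax policy over action preferences in a single state. It is a simplified illustration, not the course's Actor-Critic assignment: the names are hypothetical, and the TD error is passed in as a number rather than computed by a critic.

```python
import math


def softmax(preferences):
    """Turn action preferences into action probabilities."""
    m = max(preferences)  # subtract the max for numerical stability
    exps = [math.exp(p - m) for p in preferences]
    total = sum(exps)
    return [e / total for e in exps]


def actor_step(preferences, action, td_error, alpha=0.1):
    """One policy gradient step: theta += alpha * td_error * grad log pi(action)."""
    pi = softmax(preferences)
    # For a softmax over preferences, grad log pi(a) = one_hot(a) - pi.
    grad_log_pi = [(1.0 if a == action else 0.0) - pi[a] for a in range(len(pi))]
    return [p + alpha * td_error * g for p, g in zip(preferences, grad_log_pi)]


theta = actor_step([0.0, 0.0], action=0, td_error=1.0)
```

A positive TD error raises the preference for the action taken and lowers the others, so that action becomes more likely.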

Taught by

Martha White and Adam White

Reviews

4.8 rating, based on 11 reviews

  • Anonymous

    Anonymous completed this course.

    I really enjoyed this class. A mind-blowing tour of the main algorithms used for continuous online use cases. Very clearly articulated lectures. Big congrats to Martha and Adam!
  • Anonymous

    Anonymous completed this course.

    The course is very concise and to the point. It covers all the necessary aspects and tries its best to be in sync with the reinforcement learning book by Sutton. The instructors are well experienced and know how to present an idea in an easy but elegant manner. The weekly quizzes and projects are challenging and will surely test the learner's understanding of the course. If you are new to reinforcement learning, I would really recommend this course along with the two courses that precede it in the Reinforcement Learning Specialization by University of Alberta. All in all, you will have a great time taking this course.
  • Luiz C.

    Luiz completed this course, spending 1 hour a week on it and found the course difficulty to be medium.

    Almost perfect, except two ~minor objections:

    1/ the learning content between the 4 weeks is quite unbalanced. The initial weeks of the course are well sized, whereas week #3 and week #4 feel a touch light. It feels like the Instructors rushed to make the Course available online, and didn't have time to put as much content as they wished in the last weeks of the Course

    2/ there are too many typos in some notebooks (specifically the notebook of week #3). It gives the impression it was made in a rush, and nobody read over it again. Besides, there currently seems to be some issue with this assignment
  • Anonymous

    Anonymous completed this course.

    Definitely a course to take to learn the ropes of RL. For this course, it is critical to follow the math. 4 stars instead of 5 only because the math could be made easier to follow with some extra effort from the tutors. But if you're strong in math, you should be fine. The math itself is not difficult, but the notation is challenging and the terminology is a bit tough to keep in your head.
  • Jose F.

    Jose completed this course, spending 3 hours a week on it and found the course difficulty to be medium.

    The instructors do a great job summarizing and being concise while following Sutton & Barto's RL introduction book.

    The programming exercises, done via Jupyter notebooks, really help to consolidate the theoretical knowledge acquired during the lessons and by reading the book.

    Highly recommended course for anyone interested in getting a practical introduction to RL algorithms.
  • Anonymous

    Anonymous completed this course.

    This course is very rich in both mathematical and practical concepts, and it actually provides you with powerful tools to understand and use Reinforcement Learning. So far, it is the most interesting course in this specialization. Lectures are very clear and they often explain more deeply some concepts you find in the textbook. Quizzes are challenging and well constructed.
  • Anonymous

    Anonymous completed this course.

    I really enjoyed this third course of the specialisation.

    The content and explanations are very helpful in building your intuition around quite complex concepts of RL with approximation. Quizzes and programming exercises are challenging enough to help you grasp necessary concepts and get hands on experience. Look forward to the next course in the specialisation.
  • Anonymous

    Anonymous completed this course.

    This course covers a wide variety of topics and dives a good amount into each of them.
    I wish the instructors would cover some of the topics and the math in a little more detail, and some of the content seems a tiny bit rushed, but otherwise, a brilliant course overall.
  • Anonymous
    Really engaging and interesting course. Amazingly talented instructors and equally amazing content. A must for those who are learning reinforcement learning or those who want to expand their knowledge in the field.
  • Anonymous

    Anonymous completed this course.

    Amazing course with amazing, intuitive visualizations. It is clear that the instructors have spent a lot of time and effort in trying to make the course as visually descriptive as possible.
  • Anonymous

    Anonymous completed this course.

    Nice course that is part of 3 more courses. All of them together cover a wide area in RL. Beginners in maths can easily follow. It's good to know some Python before you start (very basic level)
