Prediction and Control with Function Approximation

University of Alberta and Alberta Machine Intelligence Institute via Coursera

Details

Provider: Coursera
Pricing: Free Online Course (Audit)
Languages: English
Certificate: Paid Certificate Available
Duration & workload: 21 hours 51 minutes
Sessions: On-Demand
Level: Intermediate
Subtitles: Arabic, French, Portuguese, Italian, German, Russian, English, Spanish, Thai, Indonesian, Kazakh, Hindi, Swedish, Korean, Greek, Chinese, Ukrainian, Japanese, Polish, Dutch, Turkish, Hungarian, Bengali, Pashto, Urdu, Azerbaijani, Farsi

Part of

Reinforcement Learning

3.5

Overview

In this course, you will learn how to solve problems with large, high-dimensional, and potentially infinite state spaces. You will see that estimating value functions can be cast as a supervised learning problem (function approximation), allowing you to build agents that carefully balance generalization and discrimination in order to maximize reward. We will begin this journey by investigating how our policy evaluation or prediction methods like Monte Carlo and TD can be extended to the function approximation setting. You will learn about feature construction techniques for RL, and representation learning via neural networks and backprop. We conclude this course with a deep dive into policy gradient methods, a way to learn policies directly without learning a value function. In this course you will solve two continuous-state control tasks and investigate the benefits of policy gradient methods in a continuous-action environment.

Prerequisites: This course strongly builds on the fundamentals of Courses 1 and 2, and learners should have completed these before starting this course. Learners should also be comfortable with probabilities & expectations, basic linear algebra, basic calculus, Python 3.0 (at least 1 year), and implementing algorithms from pseudocode.

By the end of this course, you will be able to:
- Understand how to use supervised learning approaches to approximate value functions
- Understand objectives for prediction (value estimation) under function approximation
- Implement TD with function approximation (state aggregation), on an environment with an infinite state space (continuous state space)
- Understand fixed basis and neural network approaches to feature construction
- Implement TD with neural network function approximation in a continuous state environment
- Understand new difficulties in exploration when moving to function approximation
- Contrast discounted problem formulations for control versus an average reward problem formulation
- Implement expected Sarsa and Q-learning with function approximation on a continuous state control task
- Understand objectives for directly estimating policies (policy gradient objectives)
- Implement a policy gradient method (called Actor-Critic) on a discrete state environment

Syllabus

Welcome to the Course!

Welcome to the third course in the Reinforcement Learning Specialization: Prediction and Control with Function Approximation, brought to you by the University of Alberta, Onlea, and Coursera. In this pre-course module, you'll be introduced to your instructors, and get a flavour of what the course has in store for you. Make sure to introduce yourself to your classmates in the "Meet and Greet" section!

On-policy Prediction with Approximation

This week you will learn how to estimate a value function for a given policy when the number of states is much larger than the memory available to the agent. You will learn how to specify a parametric form of the value function, how to specify an objective function, and how stochastic gradient descent can be used to estimate values from interaction with the world.
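
As a rough illustration of this module's idea, the sketch below runs semi-gradient TD(0) with state aggregation on a 1000-state random walk. The environment, the group count, the step size, and every function name here are assumptions chosen for the example; this is not the course's assignment code.

```python
import random

NUM_STATES = 1000      # assumed environment size (illustrative only)
NUM_GROUPS = 10        # states are aggregated into 10 groups of 100
GROUP_SIZE = NUM_STATES // NUM_GROUPS
MAX_JUMP = 100         # each step jumps 1..100 states left or right


def group_of(state):
    """Map a state in 1..NUM_STATES to the index of its aggregate group."""
    return (state - 1) // GROUP_SIZE


def step(state, rng):
    """Random-walk transition: returns (next_state, reward, done)."""
    jump = rng.randint(1, MAX_JUMP)
    next_state = state + jump if rng.random() < 0.5 else state - jump
    if next_state < 1:
        return None, -1.0, True     # terminated off the left edge
    if next_state > NUM_STATES:
        return None, 1.0, True      # terminated off the right edge
    return next_state, 0.0, False


def semi_gradient_td0(num_episodes=2000, alpha=0.05, gamma=1.0, seed=0):
    """Learn one weight per group, so v_hat(s) = weights[group_of(s)]."""
    rng = random.Random(seed)
    weights = [0.0] * NUM_GROUPS
    for _ in range(num_episodes):
        state = NUM_STATES // 2
        done = False
        while not done:
            next_state, reward, done = step(state, rng)
            if done:
                target = reward
            else:
                target = reward + gamma * weights[group_of(next_state)]
            g = group_of(state)
            # With state aggregation the gradient of v_hat is one-hot,
            # so only the current state's group weight changes.
            weights[g] += alpha * (target - weights[g])
            state = next_state
    return weights


weights = semi_gradient_td0()
print([round(w, 2) for w in weights])
```

The ten learned weights form a staircase approximation of the value function: every state in a group shares one estimate, which is the generalization the parametric form buys at the cost of resolution.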

Constructing Features for Prediction

The features used to construct the agent’s value estimates are perhaps the most crucial part of a successful learning system. In this module we discuss two basic strategies for constructing features: (1) fixed bases that form an exhaustive partition of the input, and (2) adapting the features while the agent interacts with the world via neural networks and backpropagation. In this week’s graded assessment you will solve a simple but infinite-state prediction task with a neural network and TD learning.
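
To make the first strategy concrete, here is a minimal one-dimensional tile coder, one example of a fixed basis. The function names, the number of tilings, and the offset scheme are assumptions for illustration and may differ from the tile coder used in the course.

```python
def active_tiles(x, low, high, num_tilings=4, tiles_per_tiling=8):
    """Return the index of the single active tile in each tiling for scalar x.

    Each tiling partitions [low, high] into equal-width tiles. Successive
    tilings are shifted by a fraction of a tile width, so nearby inputs
    share some, but not necessarily all, of their active tiles.
    """
    tile_width = (high - low) / tiles_per_tiling
    indices = []
    for tiling in range(num_tilings):
        offset = tiling * tile_width / num_tilings
        tile = int((x - low + offset) / tile_width)
        # One extra tile per tiling covers the region exposed by the shift.
        tile = min(max(tile, 0), tiles_per_tiling)
        indices.append(tiling * (tiles_per_tiling + 1) + tile)
    return indices


def value(x, weights, low=0.0, high=1.0):
    """Linear value estimate: the sum of the weights of the active tiles."""
    return sum(weights[i] for i in active_tiles(x, low, high))


num_features = 4 * (8 + 1)
weights = [0.0] * num_features
a = active_tiles(0.50, 0.0, 1.0)
b = active_tiles(0.55, 0.0, 1.0)
print(a, b, len(set(a) & set(b)))
```

The two nearby inputs share most of their active tiles, so an update made for one also moves the estimate for the other. Inputs far apart share none and are learned independently.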

Control with Approximation

This week, you will see that the concepts and tools introduced in modules two and three allow straightforward extension of classic TD control methods to the function approximation setting. In particular, you will learn how to find the optimal policy in infinite-state MDPs by simply combining semi-gradient TD methods with generalized policy iteration, yielding classic control methods like Q-learning and Sarsa. We conclude with a discussion of a new problem formulation for RL, average reward, which will undoubtedly be used in many applications of RL in the future.
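
The combination described here can be sketched as a single update function. The version below assumes a linear action-value function with one weight vector per action; the function names, default step size, and discount are hypothetical and not taken from the course.

```python
def q_value(weights, features, action):
    """Linear action value: dot product of the action's weights and the features."""
    return sum(w * x for w, x in zip(weights[action], features))


def semi_gradient_update(weights, features, action, reward, next_features,
                         alpha=0.1, gamma=0.99, done=False, next_action=None):
    """Apply one semi-gradient TD control update in place; return the TD error.

    If next_action is given, the target bootstraps from it (Sarsa).
    Otherwise it bootstraps from the greedy next action (Q-learning).
    """
    if done:
        bootstrap = 0.0
    elif next_action is not None:
        bootstrap = q_value(weights, next_features, next_action)
    else:
        bootstrap = max(q_value(weights, next_features, a)
                        for a in range(len(weights)))
    td_error = reward + gamma * bootstrap - q_value(weights, features, action)
    # For a linear q_hat, the gradient with respect to the chosen action's
    # weights is just the feature vector.
    for i, x in enumerate(features):
        weights[action][i] += alpha * td_error * x
    return td_error


# Two actions, three features, all weights starting at zero.
weights = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
delta = semi_gradient_update(weights, [1.0, 0.0, 0.0], 0, 1.0, [0.0, 1.0, 0.0])
print(delta, weights)
```

Only the target changes between Sarsa and Q-learning. Action selection (for example epsilon-greedy over `q_value`) supplies the policy-improvement half of generalized policy iteration and is left out of this sketch.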

Policy Gradient

Every algorithm you have learned about so far estimates a value function as an intermediate step towards the goal of finding an optimal policy. An alternative strategy is to directly learn the parameters of the policy. This week you will learn about these policy gradient methods, and their advantages over value-function based methods. You will also learn how policy gradient methods can be used to find the optimal policy in tasks with both continuous state and action spaces.
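
As a hedged sketch of learning policy parameters directly, the code below performs one simplified actor-critic step with a softmax policy over linear action preferences and a linear critic, in the discounted setting. All names and step sizes are assumptions, and the course's own assignment may use a different formulation. The softmax subtracts the maximum preference before exponentiating, which avoids overflow without changing the probabilities.

```python
import math


def softmax(preferences):
    """Numerically stable softmax: shifting by the max leaves the result unchanged."""
    m = max(preferences)
    exps = [math.exp(p - m) for p in preferences]
    total = sum(exps)
    return [e / total for e in exps]


def actor_critic_step(theta, w, features, action, reward, next_features,
                      alpha_theta=0.01, alpha_w=0.1, gamma=0.99, done=False):
    """One actor-critic update, in place. Returns the TD error.

    theta[a] holds the actor's weights for action a; w holds the critic's weights.
    """
    v = sum(wi * x for wi, x in zip(w, features))
    v_next = 0.0 if done else sum(wi * x for wi, x in zip(w, next_features))
    td_error = reward + gamma * v_next - v

    prefs = [sum(t * x for t, x in zip(theta_a, features)) for theta_a in theta]
    probs = softmax(prefs)

    # Critic: semi-gradient TD(0) update of the state-value weights.
    for i, x in enumerate(features):
        w[i] += alpha_w * td_error * x
    # Actor: for a linear softmax policy, the gradient of log pi(a|s) with
    # respect to theta[b] is x * (1[b == a] - pi(b|s)).
    for b, theta_b in enumerate(theta):
        indicator = 1.0 if b == action else 0.0
        for i, x in enumerate(features):
            theta_b[i] += alpha_theta * td_error * (indicator - probs[b]) * x
    return td_error


theta = [[0.0, 0.0], [0.0, 0.0]]   # two actions, two features
w = [0.0, 0.0]
delta = actor_critic_step(theta, w, [1.0, 0.0], 0, 1.0, [0.0, 1.0])
print(delta, w, theta)
```

A positive TD error raises the preference for the action just taken and lowers the others. No action-value function is learned at any point, which is what lets the same scheme extend to continuous actions when the softmax is swapped for a parameterized continuous distribution.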

Taught by

Martha White and Adam White

Reviews

4.7 rating, based on 22 Class Central reviews

4.8 rating at Coursera based on 808 ratings

Anonymous

It was an excellent course. I enjoyed the learning a lot. The concepts were very helpful and useful. It would be better if the libraries used were open-sourced, or if students were taught to build them themselves, so that they could apply what they learn to their real-world problems.
Anonymous

BEFORE this course: I’ve done a number of Coursera courses before. Whilst they are good, the level of learning tends to be superficial.

THIS course is the third of four courses. These are the best courses I’ve taken and I now feel I have learnt a very significant amount. Below applies to all three courses.

I have seen someone criticize the course by saying ‘it is just them talking through the Sutton & Barto book’. In defense: (i) the book *does* seem to be *the* seminal introductory text, (ii) they have selected which bits to cut out, (iii) I have now read through the recommended chapters as part of the course and have far greater insight.

In some cases, the slides show things more clearly than the book. In some cases, the sentences are far too complicated to digest in one go. You need to rewind again and again to understand things.

I have found the time taken to do the assignments to be much longer than they suggest, but this is largely down to my lack of Python abilities. A lot of time was spent improving my Python, which was a good by-product. The intermediate checking of code (within Jupyter) could be better, but the forums help.

Thanks!
Anonymous

The course is really fine. I suggest further improving the tile coding section; also, in assignment 4, the computation of delta is quite confusing.
Anonymous

The community support for the course, and in fact for the entire specialization, is next to negligible. More than 50% of the course content simply reiterates what is in the reference book, and the explanations are even shallower than those in the book.
Anonymous

As with the two prior classes in this specialization, I really appreciated this third class on prediction and control with function approximation. The lectures really help clarify the material that is presented in the book, and the programming assignments and quizzes challenge you to understand the equations and how the updates are calculated. I actually figured out the value of subtracting off the baseline from softmax in a very real way during the quiz when I was calculating values like e^(-44) and e^(-42) instead of "1" and e^(-2). Examples make it real.... Thanks so much to Martha and Adam for the effort they put into presenting the material in a clear way.
Anonymous

I really enjoyed this class. A mind blowing tour of the main algorithms used for continuous online use cases. Very clearly articulated lectures. Big congrats to Martha and Adam!
Anonymous

Very good course for learning Reinforcement Learning. The instructors are very good and the approach to teaching is best suited to understanding the subject properly.
Anonymous

Best online course I've taken. Everything is extremely clear, easy to understand, optimally challenging and fun.
Anonymous

Excellent instructors!

And the textbook Reinforcement Learning: An Introduction is a masterpiece in itself.
Anonymous

The course is very concise and to the point. It covers all the necessary aspects and tries its best to be in sync with the reinforcement learning book by Sutton. The instructors are well experienced and know how to present an idea in an easy but elegant manner. The weekly quizzes and projects are challenging and will surely test the learner's understanding of the course. If you are new to reinforcement learning, I would really recommend this course along with the two courses before it in the Reinforcement Learning Specialization by University of Alberta. All in all you will have a great time learning this course.
Luiz Cunha

Almost perfect, except two ~minor objections:

1/ the learning content between the 4 weeks is quite unbalanced. The initial weeks of the course are well sized, whereas week #3 and week #4 feel a touch light. It feels like the Instructors rushed to make the Course available online, and didn't have time to put as much content as they wished in the last weeks of the Course

2/ there are too many typos in some notebooks (specifically the notebook of week #3). It gives the impression it was made in a rush, and nobody read over it again. Besides, there currently seems to be some issue with this assignment
Anonymous

Definitely a course to take to learn the ropes of RL. For this course, it is critical to follow the math. 4 stars instead of 5 only because the math could be made easier to follow with some extra effort from the tutors. But if you're strong in math, you should be fine. The math itself is not difficult, but the notation is challenging and the terminology is a bit tough to keep in your head.
Jose Marcos Rodríguez Fernández

The instructors do a great job summarizing and being concise while following Sutton & Barto's RL introduction book.

The programming exercises, done via jupyter notebooks, really help to consolidate the theoretical knowledge acquired during the lessons and by reading the book.

Highly recommended course for anyone interested in getting a practical introduction to RL algorithms.
Anonymous

This course is very rich in both mathematical and practical concepts, and it actually provides you with powerful tools to understand and use Reinforcement Learning. So far, it is the most interesting course in this specialization. Lectures are very clear and they often explain more deeply some concepts you find in the textbook. Quizzes are challenging and well constructed.
Anonymous

I really enjoyed this third course of the specialisation.

The content and explanations are very helpful in building your intuition around quite complex concepts of RL with approximation. Quizzes and programming exercises are challenging enough to help you grasp necessary concepts and get hands on experience. Look forward to the next course in the specialisation.
Anonymous

I really enjoyed taking this course and learned a lot. The Reinforcement Learning Specialization (https://www.coursera.org/specializations/reinforcement-learning) is a great introduction to reinforcement learning. This course is the third one in the specialization. All programming assignments are in Python.
Anonymous

This course covers a wide variety of topics and dives a good amount into each of them.
I wish the instructors would cover some of the topics and the math in a little more detail, and some of the content seems a tiny bit rushed, but otherwise, a brilliant course overall.
Anonymous

Really engaging and interesting course. Amazingly talented instructors and equally amazing content. A must for those who are learning reinforcement learning or those who want to expand their knowledge in the field.
Anonymous

Amazing course with amazing, intuitive visualizations. It is clear that the instructors have spent a lot of time and effort in trying to make the course as visually descriptive as possible.
Anonymous

Nice course that is part of 3 more courses. All of them together cover a wide area in RL. Beginners in maths can easily follow. It's good to know some Python before you start (very basic level)
