- foundations of RL methods: value/policy iteration, q-learning, policy gradient, etc.
--- with math & batteries included
- using deep neural networks for RL tasks
--- also known as "the hype train"
- state of the art RL algorithms
--- and how to apply duct tape to them for practical problems.
- and, of course, teaching your neural network to play games
--- because that's what everyone thinks RL is about. We'll also use it for seq2seq and contextual bandits.
Jump in. It's gonna be fun!
Do you have technical problems? Write to us: email@example.com
Intro: why should I care?
In this module we are gonna define and "taste" what reinforcement learning is about. We'll also learn one simple algorithm that can solve reinforcement learning problems with embarrassing efficiency.
At the heart of RL: Dynamic Programming
This week we'll consider the reinforcement learning formalisms in a more rigorous, mathematical way. You'll learn how to effectively compute the return your agent gets for a particular action - and how to pick the best actions based on that return.
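As a rough illustration of computing returns and picking the best action with dynamic programming, here is a minimal value iteration sketch. The three-state chain, its rewards, and the names `transitions`, `action_value`, `value_iteration` and `greedy_policy` are all made up for this example and are not taken from the course materials.

```python
# Hypothetical 3-state chain MDP, invented purely for illustration.
# transitions[state][action] = (next_state, reward); state 2 is terminal.
GAMMA = 0.9  # discount factor (an assumed value)

transitions = {
    0: {"stay": (0, 0.0), "go": (1, 0.0)},
    1: {"stay": (1, 0.0), "go": (2, 1.0)},
    2: {},  # terminal state: no actions available
}


def action_value(values, state, action, gamma=GAMMA):
    """Return of taking `action` in `state`, then following `values`."""
    next_state, reward = transitions[state][action]
    return reward + gamma * values[next_state]


def value_iteration(gamma=GAMMA, tol=1e-8):
    """Sweep over states until the largest value change drops below `tol`."""
    values = {s: 0.0 for s in transitions}
    while True:
        delta = 0.0
        for s, actions in transitions.items():
            if not actions:  # terminal states keep value 0
                continue
            best = max(action_value(values, s, a, gamma) for a in actions)
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            return values


def greedy_policy(values, gamma=GAMMA):
    """Pick, in every non-terminal state, the action with the highest return."""
    return {
        s: max(actions, key=lambda a: action_value(values, s, a, gamma))
        for s, actions in transitions.items()
        if actions
    }


values = value_iteration()
policy = greedy_policy(values)
```

In this toy chain the only reward sits on the last transition, so the greedy policy ends up choosing "go" in both non-terminal states.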
This week we'll find out how to apply last week's ideas to real-world problems: ones where you don't have a perfect model of your environment.
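One way to learn without a model of the environment is tabular Q-learning, which the course description lists among its topics. The sketch below shows a single update step only. The function name `q_learning_update`, the state and action labels, and the values of `alpha` and `gamma` are assumptions made for illustration, not the course's own code.

```python
from collections import defaultdict


def q_learning_update(q, state, action, reward, next_state, next_actions,
                      alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning step to the table `q` and return the new value.

    `q` maps (state, action) pairs to estimated returns. Only the observed
    transition (state, action, reward, next_state) is needed, not a model
    of the environment's dynamics.
    """
    # Value of the best action available in the next state (0 if terminal).
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    td_target = reward + gamma * best_next
    # Move the current estimate a fraction `alpha` toward the target.
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Usage with made-up transitions: unseen pairs default to 0.0.
q_table = defaultdict(float)
first = q_learning_update(q_table, "s0", "go", 1.0, "s1", ["stay", "go"])
second = q_learning_update(q_table, "s0", "go", 1.0, "s1", ["stay", "go"])
```

Repeating the same transition moves the estimate halfway toward the target each time because `alpha` is 0.5 here; in practice the step size and the exploration strategy would need tuning.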
Approximate Value Based Methods
This week we'll learn to scale things up even further by training agents based on neural networks.
We spent the three previous modules working on value-based methods: learning state values, action values and whatnot. Now's the time to see an alternative approach that doesn't require you to predict all future rewards to learn something.
In this final week you'll learn how to build better exploration strategies, with a focus on the contextual bandit setup. In the honors track, you'll also learn how to apply reinforcement learning to train structured deep learning models.
Abhilash Vj is taking this course right now, spending 8 hours a week on it and found the course difficulty to be medium.
I tried to follow CS294 from UC Berkeley and to watch the David Silver and John Schulman lecture videos, but I struggled to understand the practical implementations of all those algorithms. In this course we jump to a practical assignment after most lectures, which helped me gain a practical sense of everything taught and kept me heavily motivated. I binge-watched the videos and did the programming assignments the weekend I got access to the course. I think this course can be to reinforcement learning what Andrew Ng's course is to machine learning.
This course has honors materials and some extras each week, which link to materials from CS294, John Schulman, David Silver, and other blogs and related sources. That makes it accessible to many while still offering material for people who wish to go deeper. You will code DQNs to play the Atari game Breakout in week 4, and in week 5 you'll build a deep reinforcement learning agent for Atari KungFuMaster and train it with advantage actor-critic. The course covers value iteration, policy iteration, Q-learning, SARSA, expected SARSA, Monte Carlo and temporal difference methods, approximate Q-learning, REINFORCE, DQN, A3C, and Monte Carlo Tree Search if you go honors, and you will still have access to all the materials even if you take the normal path. Hope you will enjoy the course. Thank you for reading.
Francesco R completed this course, spending 9 hours a week on it and found the course difficulty to be medium.
The course well deserves five, or even six, stars for offering this content. Despite the continual fanfare in the media and on SNS, RL and deep RL are almost never covered by MOOCs, and this course goes even beyond being a “notable exception”. The problems that have been prepared and the assignments based on OpenAI gym are really challenging and entertaining. “Practical” is really a proper attribute of this course, and this does not detract from the quality of the content, as the lecturers provided plenty of links to state-of-the-art techniques - and many assignments make use of discoveries that are just two or three years old.
Alas, at this time it seems that the bulk of this course on Coursera is just a port of an already existing one on GitHub. That would not be bad per se, if only the assignment environment and the graders worked properly...
You need a lot of commitment in order to make everything work - possibly getting help from the discussion platform - and the best bet is to download everything and work offline. Prepare to install or reinstall a lot of Python modules, especially if you work from Windows.
At the time I am writing this, most images within the Jupyter notebooks are also not showing, and you need to fetch them from the original GitHub course.