Overview
This course on Reinforcement Learning has the following learning outcomes and goals:
- Understanding the concept of Reinforcement Learning
- Exploring Policy Gradient and REINFORCE algorithms
- Learning methods to stabilize Reinforcement Learning
- Implementing Value-based Reinforcement Learning techniques
The course teaches the following individual skills and tools:
- Policy Gradient and REINFORCE algorithms
- Stabilization techniques for Reinforcement Learning
- Value function estimation in Value-based Reinforcement Learning
- Balancing exploration and exploitation in Reinforcement Learning
The course is taught as lectures with slides, accompanied by code examples for practical implementation.
The course is intended for students and professionals interested in Neural Networks for Natural Language Processing, with a specific focus on Reinforcement Learning techniques.
Syllabus
Intro
What is reinforcement learning
Examples of reinforcement learning
Supervised Learning
Self Training
Policy Gradient
Credit assignment
Problem
Baseline
Calculating the baseline
Increasing batch size
Reinforcement Learning
Runthrough
Value-based reinforcement learning
Estimating value functions
Exploration vs exploitation
Reinforcement learning examples
Dialogue
User simulators
Actions in spaces
Taught by
Graham Neubig