In this HUJI Machine Learning Club lecture, Uri Sherman presents his research on "Convergence of Policy Mirror Descent Beyond Compatible Function Approximation". The talk examines how Policy Mirror Descent (PMD) algorithms in reinforcement learning can converge outside the compatible function approximation setting. It covers new theoretical results that establish convergence rates independent of the state space cardinality for general policy classes, under a variational gradient dominance condition. The analysis frames PMD as a proximal point algorithm in a non-Euclidean space with adaptive proximal operators. Uri Sherman is a fifth-year PhD student at Tel Aviv University, advised by Yishay Mansour and Tomer Koren, with experience in both industry and academia. The lecture takes place on Thursday, April 3rd, 2025, at 10:30 AM in room B220.
Convergence of Policy Mirror Descent Beyond Compatible Function Approximation (Hebrew)
HUJI Machine Learning Club via YouTube
Overview
Syllabus
Thursday, April 3rd, 2025, 10:30 AM, room B220
Taught by
HUJI Machine Learning Club