BEGIN:VCALENDAR
VERSION:2.0
PRODID:researchseminars.org
CALSCALE:GREGORIAN
X-WR-CALNAME:researchseminars.org
BEGIN:VEVENT
SUMMARY:Guido Montufar (UCLA)
DTSTART:20230222T160000Z
DTEND:20230222T170000Z
DTSTAMP:20260423T021058Z
UID:CompAlg/5
DESCRIPTION:Title: Geometry and convergence of natural policy gradient me
 thods (https://researchseminars.org/talk/CompAlg/5/)\nby Guido Montufar (
 UCLA) as part of Machine Learning Seminar\n\nAbstract\nWe study the conv
 ergence of several natural policy gradient (NPG) methods in infinite-ho
 rizon discounted Markov decision processes with regular policy parametr
 izations. For a variety of NPGs and reward functions we show that the t
 rajectories in state-action space are solutions of gradient flows with re
 spect to Hessian geometries\, based on which we obtain global convergen
 ce guarantees and convergence rates. In particular\, we show linear con
 vergence for unregularized and regularized NPG flows with the metrics p
 roposed by Kakade and Morimura and co-authors by observing that these a
 rise from the Hessian geometries of conditional entropy and entropy res
 pectively. Further\, we obtain sublinear convergence rates for Hessian g
 eometries arising from other convex functions like log-barriers. Finall
 y\, we interpret the discrete-time NPG methods with regularized rewards a
 s inexact Newton methods if the NPG is defined with respect to the Hess
 ian geometry of the regularizer. This yields local quadratic convergenc
 e rates of these methods for step size equal to the penalization streng
 th. This is work with Johannes Müller.\n
LOCATION:https://researchseminars.org/talk/CompAlg/5/
END:VEVENT
END:VCALENDAR
