BEGIN:VCALENDAR
VERSION:2.0
PRODID:researchseminars.org
CALSCALE:GREGORIAN
X-WR-CALNAME:researchseminars.org
BEGIN:VEVENT
SUMMARY:Yuejie Chi (Carnegie Mellon University)
DTSTART:20210625T130000Z
DTEND:20210625T140000Z
DTSTAMP:20260423T003302Z
UID:MPML/49
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/MPML/49/">Po
 licy Optimization in Reinforcement Learning: A Tale of Preconditioning and
  Regularization</a>\nby Yuejie Chi (Carnegie Mellon University) as part of
  Mathematics\, Physics and Machine Learning (IST\, Lisbon)\n\n\nAbstract\n
 Policy optimization\, which learns the policy of interest by maximizing th
 e value function via large-scale optimization techniques\, lies at the hea
 rt of modern reinforcement learning (RL). In addition to value maximizatio
 n\, other practical considerations commonly arise as well\, including t
 he need to encourage exploration and to ensure certain structural prop
 erties of the learned policy due to safety\, resource\, and operationa
 l constraints. These considerations can often be accounted for by reso
 rting to regularized RL\, which augments the target value function wit
 h a struct
 ure-promoting regularization term\, such as Shannon entropy\, Tsallis entr
 opy\, and log-barrier functions. Focusing on an infinite-horizon discounte
 d Markov decision process\, this talk first shows that entropy-regulari
 zed natural policy gradient methods converge globally at a linear rat
 e that is nearly independent of the dimension of the state-action spac
 e. Next
 \, a generalized policy mirror descent algorithm is proposed to accommodat
 e a general class of convex regularizers beyond Shannon entropy. Encouragi
 ngly\, this general algorithm inherits similar convergence guarantees\, ev
 en when the regularizer lacks strong convexity and smoothness. Our results
  accommodate a wide range of learning rates\, and shed light upon the role
  of regularization in enabling fast convergence in RL.\n
LOCATION:https://researchseminars.org/talk/MPML/49/
END:VEVENT
END:VCALENDAR
