BEGIN:VCALENDAR
VERSION:2.0
PRODID:researchseminars.org
CALSCALE:GREGORIAN
X-WR-CALNAME:researchseminars.org
BEGIN:VEVENT
SUMMARY:Arnulf Jentzen (School of Data Science & School of Artificial Inte
 lligence\, The Chinese University of Hong Kong\, Shenzhen and Institute fo
 r Analysis and Numerics\, University of Münster)
DTSTART:20260518T111500Z
DTEND:20260518T120000Z
DTSTAMP:20260417T004840Z
UID:cam/102
DESCRIPTION:Title: Comprehensive convergence analysis for the Adam optim
 izer (https://researchseminars.org/talk/cam/102/)\nby Arnulf Jentzen (Sc
 hool of Data Science & School of Artificial Intelligence\, The Chinese U
 niversity of Hong Kong\, Shenzhen and Institute for Analysis and Numeri
 cs\, University of Münster) as part of CAM seminar\n\nLecture held in M
 V:L14.\n\nAbstract\nIn the training of artificial intelligence (AI) sys
 tems\, the employed optimization scheme is often not the standard gradi
 ent descent (GD) method but instead a suitable accelerated and/or adapt
 ive GD method\, such as the momentum or the RMSprop method. The most po
 pular such accelerated/adaptive optimization method is presumably the A
 dam optimizer due to Kingma & Ba (2014). In this talk we introduce the n
 otion of the stability region for general deep learning optimization me
 thods and we reveal that\, among standard GD\, momentum\, RMSprop\, and A
 dam\, only Adam achieves the optimal higher-order convergence speed and a
 lso has the maximal stability region. In another main result of this ta
 lk\, which we refer to as the Adam symmetry theorem\, we show for a sim
 ple class of quadratic stochastic optimization problems (SOPs) that Ada
 m converges\, as the number of Adam steps increases\, to the solution o
 f the SOP (the unique minimizer of the strongly convex objective functi
 on) if and only if the random variables in the SOP (the data in the SOP
 ) are symmetrically distributed. In particular\, in the standard case w
 here the random variables in the SOP are not symmetrically distributed
 \, we show that Adam does not converge to the minimizer of the SOP as t
 he number of Adam steps increases. The talk is based on joint works wit
 h Steffen Dereich\, Thang Do\, Robin Graeber\, Sebastian Kassing\, Adri
 an Riekert\, and Philippe von Wurstemberger.\n
LOCATION:https://researchseminars.org/talk/cam/102/
END:VEVENT
END:VCALENDAR
