Two-time scale stochastic approximation for reinforcement learning with linear function approximation
Pedro A. Santos (Instituto Superior Técnico and INESC-ID)
Abstract: In this presentation, I will introduce some traditional Reinforcement Learning problems and algorithms, and analyze how some of their difficulties can be avoided, and convergence results obtained, using a two-time scale variation of the usual stochastic approximation approach.
This variation was inspired by the practical success of Deep Q-Learning, with which DeepMind's research team attained superhuman performance in some classic Atari games in 2015. Practical successes in Machine Learning like this one often lack a theory that explains them. The work that will be presented aims to contribute to such a theory.
Joint work with Diogo Carvalho and Francisco Melo from INESC-ID.
data structures and algorithms; machine learning; mathematical physics; information theory; optimization and control; data analysis, statistics and probability
Audience: researchers in the topic
Mathematics, Physics and Machine Learning (IST, Lisbon)
Series comments: To receive the series announcements, please register at:
mpml.tecnico.ulisboa.pt
mpml.tecnico.ulisboa.pt/registration
Zoom link: videoconf-colibri.zoom.us/j/91599759679
Organizers: Mário Figueiredo, Tiago Domingos, Francisco Melo, Jose Mourao*, Cláudia Nunes, Yasser Omar, Pedro Alexandre Santos, João Seixas, Cláudia Soares, João Xavier
*contact for this listing