The generalization error of overparametrized models: Insights from exact asymptotics

Andrea Montanari (Stanford)

10-Jun-2020, 14:00-15:00

Abstract: In a canonical supervised learning setting, we are given n data samples, each comprising a feature vector and a label, or response variable. We are asked to learn a function f that can predict the label associated to a new, unseen feature vector. How is it possible that the model learnt from observed data generalizes to new points? Classical learning theory assumes that data points are drawn i.i.d. from a common distribution and argues that this phenomenon is a consequence of uniform convergence: the training error is close to its expectation uniformly over all models in a certain class. Modern deep learning systems appear to defy this viewpoint: they achieve training error that is significantly smaller than the test error, and yet generalize well to new data. I will present a sequence of high-dimensional examples in which this phenomenon can be understood in detail. [Based on joint work with …]
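A minimal numerical sketch of the phenomenon the abstract describes, using the simplest overparametrized example: the minimum-ell-2-norm least-squares interpolator with Gaussian features. This is an illustration added to this listing, not material from the talk, and all dimensions, noise levels, and parameter choices below are arbitrary assumptions.

```python
# Illustrative sketch (not from the talk): with p > n, the minimum-norm
# least-squares fit interpolates the training labels (training error ~ 0),
# yet its test error can remain moderate.
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 400                            # n samples, p features; p > n (overparametrized)
w_star = rng.normal(size=p) / np.sqrt(p)   # ground-truth coefficients (arbitrary scaling)
X = rng.normal(size=(n, p))
y = X @ w_star + 0.1 * rng.normal(size=n)  # noisy labels

# Minimum-ell-2-norm interpolator, computed via the pseudoinverse
w_hat = np.linalg.pinv(X) @ y

train_err = np.mean((X @ w_hat - y) ** 2)

X_test = rng.normal(size=(2000, p))
y_test = X_test @ w_star + 0.1 * rng.normal(size=2000)
test_err = np.mean((X_test @ w_hat - y_test) ** 2)

print(f"training MSE: {train_err:.2e}")    # essentially zero: the model interpolates
print(f"test MSE:     {test_err:.2e}")     # finite, despite zero training error
```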

optimization and control; statistics theory

Audience: researchers in the topic


MAD+

Series comments: Research seminar on data science

See here for Zoom links to individual seminars, links to recordings, and to subscribe to calendar or mailing list.

Organizers: Afonso S. Bandeira*, Joan Bruna, Carlos Fernandez-Granda, Jonathan Niles-Weed, Ilias Zadik
*contact for this listing
