On the benefit of over-parametrization and the origin of double descent curves in artificial neural networks

Giulio Biroli (ENS Paris)

29-Jul-2020, 14:00-15:00

Abstract: Deep neural networks have triggered a revolution in machine learning, and more generally in computer science. Understanding their remarkable performance is a key scientific challenge with many open questions. For instance, practitioners find that using massively over-parameterised networks is beneficial to learning and generalization ability. This fact goes against standard theories and defies intuition. In this talk I will address this issue. I will first contrast standard expectations based on the bias-variance trade-off with the results of numerical experiments on deep neural networks, which display a “double-descent” behavior of the test error as the number of parameters increases, instead of the traditional U-shaped curve. I will then discuss a theory of this phenomenon based on the solution of simplified models of deep neural networks by statistical physics methods.
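The double-descent curve mentioned in the abstract can be reproduced in simple toy settings. The sketch below uses ridgeless (minimum-norm) regression on random ReLU features, a common illustrative model that is assumed here and is not necessarily the model analysed in the talk; all names and parameter values are illustrative. The test error typically peaks near the interpolation threshold, where the number of features equals the number of training samples, and decreases again in the over-parameterised regime.

```python
# Minimal sketch (assumption: random-features regression as a stand-in for
# the simplified models discussed in the talk). The ridgeless fit uses the
# pseudo-inverse, so an interpolation peak appears when the number of
# random features matches the number of training samples.
import numpy as np

rng = np.random.default_rng(0)

n_train, n_test, d = 100, 1000, 30           # samples, test points, input dim
w_star = rng.normal(size=d) / np.sqrt(d)     # ground-truth linear teacher

X_train = rng.normal(size=(n_train, d))
X_test = rng.normal(size=(n_test, d))
y_train = X_train @ w_star + 0.1 * rng.normal(size=n_train)
y_test = X_test @ w_star

def test_error(n_features):
    """Ridgeless fit on random ReLU features; returns mean squared test error."""
    V = rng.normal(size=(d, n_features)) / np.sqrt(d)    # random projection
    phi_tr = np.maximum(X_train @ V, 0.0)                # ReLU features (train)
    phi_te = np.maximum(X_test @ V, 0.0)                 # ReLU features (test)
    a = np.linalg.pinv(phi_tr) @ y_train                 # minimum-norm solution
    return np.mean((phi_te @ a - y_test) ** 2)

for p in [10, 30, 60, 90, 100, 110, 150, 300, 1000]:
    # Error typically rises sharply near p == n_train (interpolation threshold)
    # and falls again as p grows: the double-descent curve.
    print(f"p = {p:5d}   test MSE ~ {test_error(p):.3f}")
```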

optimization and control, statistics theory

Audience: researchers in the topic


MAD+

Series comments: Description: Research seminar on data science

See here for Zoom links to individual seminars, links to recordings, and to subscribe to the calendar or mailing list.

Organizers: Afonso S. Bandeira*, Joan Bruna, Carlos Fernandez-Granda, Jonathan Niles-Weed, Ilias Zadik
*contact for this listing
