Learning Dynamics of Wide, Deep Neural Networks: Beyond the Limit of Infinite Width

Yasaman Bahri (Google Brain)

01-Jul-2020, 17:15-17:40

Abstract: While many practical advances in deep learning have been made in recent years, a scientific, and ideally theoretical, understanding of modern neural networks is still in its infancy. At the heart of such an understanding is the learning dynamics of these systems. As a first step towards tackling this problem, one can try to identify limits that are theoretically tractable and potentially practically relevant. I'll begin by surveying our body of work investigating the infinite-width limit of deep networks. These results establish exact mappings between deep networks and other, existing machine learning methods (namely, Gaussian processes and kernel methods), but with modifications not previously encountered. With these exact mappings in hand, the natural question is to what extent they remain relevant for neural networks at finite width. I'll argue that the choice of learning rate is a crucial factor in the dynamics away from this limit and naturally separates deep networks into two classes divided by a sharp phase transition. We elucidate this in a class of simple, solvable models that give quantitative predictions for the two phases. Remarkably, these predictions hold up empirically in practical settings, with excellent agreement.
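To make the phase-transition claim concrete: in the infinite-width (lazy) regime, gradient descent on mean-squared error linearizes, and the training residual evolves under the neural tangent kernel. A minimal worked equation, assuming MSE loss and the linearized NTK description (an assumption on my part; the talk's solvable models are not spelled out in this listing):

```latex
% f_t: network outputs on the training set, y: targets,
% \Theta: the neural tangent kernel, \eta: the learning rate.
f_{t+1} - y = (I - \eta\,\Theta)\,(f_t - y),
\qquad \text{stable iff } \eta < \frac{2}{\lambda_{\max}(\Theta)} .
```

The standard stability bound for this iteration, eta < 2/lambda_max(Theta), is the natural candidate for the sharp boundary the abstract describes: below it, dynamics stay close to the kernel limit; above it, the network is pushed out of the linearized regime.

The exact infinite-width mappings themselves are implemented in the open-source neural-tangents library (built on JAX), which grew out of this line of work. A minimal sketch, assuming a placeholder fully connected architecture and toy data (neither is from the talk):

```python
# Sketch of the infinite-width correspondence: the network's prior over
# functions is a Gaussian process (NNGP kernel), and its gradient-descent
# training dynamics reduce to kernel regression (NTK kernel).
from jax import random
import neural_tangents as nt
from neural_tangents import stax

# A placeholder fully connected network; at infinite width its NNGP and
# NTK kernels have closed forms.
init_fn, apply_fn, kernel_fn = stax.serial(
    stax.Dense(512), stax.Relu(),
    stax.Dense(512), stax.Relu(),
    stax.Dense(1),
)

# Toy data, purely for illustration.
kx, ky, kt = random.split(random.PRNGKey(0), 3)
x_train = random.normal(kx, (20, 10))
y_train = random.normal(ky, (20, 1))
x_test = random.normal(kt, (5, 10))

# Closed-form kernels between test and train points.
k_nngp = kernel_fn(x_test, x_train, 'nngp')
k_ntk = kernel_fn(x_test, x_train, 'ntk')

# Exact infinite-width predictions: Bayesian inference under the NNGP, or
# the t -> infinity limit of gradient descent on MSE under the NTK.
predict_fn = nt.predict.gradient_descent_mse_ensemble(
    kernel_fn, x_train, y_train)
mean_ntk = predict_fn(x_test=x_test, get='ntk')
print(mean_ntk.shape)  # (5, 1)
```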

machine learning, dynamical systems, applied physics

Audience: researchers in the field


Workshop on Scientific-Driven Deep Learning (SciDL)

Series comments: When: 8:00am-2:30pm (PST) on Wednesday, July 1, 2020. Where: berkeley.zoom.us/j/95609096856 Details: scidl.netlify.app/

Organizers: N. Benjamin Erichson*, Michael Mahoney, Steven Brunton, Nathan Kutz
*contact for this listing
