BEGIN:VCALENDAR
VERSION:2.0
PRODID:researchseminars.org
CALSCALE:GREGORIAN
X-WR-CALNAME:researchseminars.org
BEGIN:VEVENT
SUMMARY:Ben Edelman (Harvard University)
DTSTART:20230209T170000Z
DTEND:20230209T180000Z
DTSTAMP:20260423T003241Z
UID:MPML/98
DESCRIPTION:Title: Studies in feature learning through the lens of sparse
  boolean functions
 \nby Ben Edelman (Harvard University) as part of Mathematics\, Physics and
  Machine Learning (IST\, Lisbon)\n\n\nAbstract\nHow do deep neural network
 s learn to construct useful features? Why do self-attention-based networks
  such as transformers perform so well on combinatorial tasks such as langu
 age learning? Why do some capabilities of networks emerge "discontinuously
 " as the computational resources used for training are scaled up? We will 
 present perspectives on these questions through the lens of a particular c
 lass of simple synthetic tasks: learning sparse boolean functions. In part
  one\, we will show that the hypothesis class of one-layer transformers ca
 n learn these functions in a statistically efficient manner. This leads to
  a view of each layer of a transformer as creating new "variables" out of 
 sparse combinations of the previous layer's outputs. In part two\, we will
  focus on the classic task of learning sparse parities\, which is statisti
 cally easy but computationally difficult. We will demonstrate that SGD on 
 various neural networks (transformers\, MLPs\, etc.) successfully learns s
 parse parities\, with computational efficiency that is close to known lowe
 r bounds. Moreover\, the training curves display no apparent progress for 
 a long time\, and then quickly drop late in training. We show that despite
  this apparent delayed breakthrough in performance\, hidden progress is ac
 tually being made throughout the course of training.\n\nBased on joint wor
 k with Surbhi Goel\, Sham Kakade\, Cyril Zhang\, Boaz Barak\, and Eran Mal
 ach:\n\nhttps://arxiv.org/abs/2110.10090\n\nhttps://arxiv.org/abs/2207.087
 99\n
LOCATION:https://researchseminars.org/talk/MPML/98/
END:VEVENT
END:VCALENDAR
