Scaling Laws in Machine Learning and GPT-3

Jared Kaplan (Johns Hopkins University)

26-Jan-2021, 18:30-19:30

Abstract: A variety of recent works suggest that scaling laws are ubiquitous in machine learning. In particular, neural network performance obeys scaling laws with respect to the number of parameters, the dataset size, and the training compute budget. I will explain these scaling laws and argue that they are both precise and remarkably universal. I will then explain how this line of thinking led to the GPT-3 language model, and what it suggests for the future.
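As a hedged illustration of the power-law form these scaling laws take: in Kaplan et al. (2020), test loss falls roughly as L(N) ≈ (Nc/N)^αN with the number of model parameters N, and analogous laws hold for dataset size and compute. The sketch below fits such a law to synthetic data in log space; the constants used to generate the data (Nc ≈ 8.8e13, αN ≈ 0.076) are approximate published values, and the data itself is illustrative, not from the talk.

```python
import numpy as np

# Scaling law of the form L(N) = (Nc / N) ** alpha (cf. Kaplan et al., 2020).
# In log space this is a straight line: log L = alpha * log Nc - alpha * log N,
# so a linear fit in log-log coordinates recovers the exponent alpha.

# Illustrative synthetic data (assumed, not from the talk): loss vs. parameter count.
rng = np.random.default_rng(0)
N = np.logspace(6, 9, 12)                          # model sizes in parameters
loss = (8.8e13 / N) ** 0.076 * np.exp(0.01 * rng.standard_normal(N.size))

slope, intercept = np.polyfit(np.log(N), np.log(loss), 1)
alpha = -slope                                     # power-law exponent
Nc = np.exp(intercept / alpha)                     # scale constant
print(f"fitted alpha = {alpha:.3f}, Nc = {Nc:.2e} parameters")
```

Fitting in log space rather than directly on (N, loss) keeps the fit numerically stable when Nc is many orders of magnitude larger than the losses.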

HEP - phenomenology, HEP - theory, mathematical physics

Audience: researchers in the topic


NHETC Seminar

Series comments: Weekly research seminar of the NHETC at Rutgers University

Livestream link is available on the webpage.

Organizers: Christina Pettola*, Sung Hak Lim, Vivek Saxena*, Erica DiPaola*
*contact for this listing
