The unreasonable effectiveness of mathematics in large scale deep learning
- Speaker: Greg Yang, Microsoft Research
- Date & Time: Wednesday 06 July 2022, 11:00 - 12:30
- Venue: Cambridge University Engineering Department, CBL Seminar room BE4-38
Abstract
Recently, the theory of infinite-width neural networks led to the first technology, muTransfer, for tuning enormous neural networks that are too expensive to train more than once. For example, this allowed us to tune the 6.7 billion parameter version of GPT-3 using only 7% of its pretraining compute budget, and with some asterisks, we get a performance comparable to the original GPT-3 model with twice the parameter count. In this talk, I will explain the core insight behind this theory. In fact, this is an instance of what I call the Optimal Scaling Thesis, which connects infinite-size limits for general notions of "size" to the optimal design of large models in practice, illustrating a way for theory to reliably guide the future of AI. I'll end with several concrete key mathematical research questions whose resolutions will have incredible impact on how practitioners scale up their NNs.
There's no required reading for the talk, but folks can look at my homepage for an overview of Tensor Programs.
Series
This talk is part of the Machine Learning Reading Group @ CUED series.