Learning linear models in-context with transformers
- Speaker: Spencer Frei, UC Davis
- Date & Time: Wednesday 25 October 2023, 11:00 - 12:30
- Venue: Cambridge University Engineering Department, CBL Seminar room BE4-38.
Abstract
Attention-based neural network sequence models such as transformers have the capacity to act as supervised learning algorithms: they can take as input a sequence of labeled examples and output predictions for unlabeled test examples. Indeed, recent work by Garg et al. has shown that when training GPT-2 architectures over random instances of linear regression problems, these models’ predictions mimic those of ordinary least squares. Towards understanding the mechanisms underlying this phenomenon, we investigate the dynamics of in-context learning of linear predictors for a transformer with a single linear self-attention layer trained by gradient flow. We show that despite the non-convexity of the underlying optimization problem, gradient flow with a random initialization finds a global minimum of the objective function. Moreover, when given a prompt of labeled examples from a new linear prediction task, the trained transformer achieves small prediction error on unlabeled test examples. We further characterize the behavior of the trained transformer under distribution shifts.
Bio: Spencer Frei is an Assistant Professor of Statistics at UC Davis. His research is on the foundations of deep learning, including topics related to benign overfitting, implicit regularization, and large language models. Prior to joining UC Davis, he was a postdoctoral fellow at UC Berkeley hosted by Peter Bartlett and Bin Yu. He was a co-organizer of the 2022 Deep Learning Theory Workshop and Summer School at the Simons Institute for the Theory of Computing. He received his Ph.D. in Statistics from UCLA in 2021 under the co-supervision of Quanquan Gu and Ying Nian Wu.
Series: This talk is part of the Machine Learning Reading Group @ CUED series.