Mean-field dynamics and training of deep transformers
- Speaker: Christoph Reisinger (University of Oxford)
- Date & Time: Monday 10 November 2025, 15:50-16:30
- Venue: Seminar Room 1, Newton Institute
Abstract
In this talk, we will examine continuous limits of transformer architectures, which form the basis of common generative models. There is a rich literature on the limiting behaviour of neural networks, including the large-width limit of single-layer networks (mean-field analysis) and the large-depth limit of residual networks (neural ODE and SDE analysis). Here, we consider limits of transformers with attention and scaling for a large number of layers, tokens, and attention heads. The analysis reveals that, for plausible training outputs, a McKean–Vlasov limit with or without diffusive common noise results. Joint work with William Gibson.
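For orientation, the sketch below illustrates the kind of limit at stake, following the standard continuous-depth view of attention in the literature; the notation (tokens x_i, matrices Q, K, V, drift b, diffusion sigma, common Brownian motion B) is assumed for illustration and is not taken from the talk itself.

```latex
% Assumed notation, a minimal sketch rather than the speaker's formulation:
% tokens x_i evolve through continuous depth t under attention,
\[
  \frac{\mathrm{d}}{\mathrm{d}t}\, x_i(t)
    = \sum_{j=1}^{n} A_{ij}(t)\, V x_j(t),
  \qquad
  A_{ij}(t) = \operatorname{softmax}_j\!\big(\langle Q x_i(t),\, K x_j(t)\rangle\big),
\]
% and, as the number of tokens grows, their empirical measure is expected to
% be described by a McKean--Vlasov equation, here written with common noise:
\[
  \mathrm{d}X_t = b\big(X_t, \mu_t\big)\,\mathrm{d}t
                + \sigma\big(X_t, \mu_t\big)\,\mathrm{d}B_t,
  \qquad
  \mu_t = \operatorname{Law}\big(X_t \,\big|\, B\big),
\]
% where B is a Brownian motion shared across tokens; taking \sigma = 0
% recovers the limit without diffusive common noise mentioned in the abstract.
```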
Series: This talk is part of the Isaac Newton Institute Seminar Series.