How does gradient descent work?
- 👤 Speaker: Jeremy Cohen (Flatiron)
- 📅 Date & Time: Friday 12 December 2025, 14:30 - 15:30
- 📍 Venue: Cambridge University Engineering Department, CBL Seminar room BE4-38.
Abstract
Optimization is the engine of deep learning, yet the theory of optimization has had little impact on the practice of deep learning. Why? In this talk, we will first show that traditional theories of optimization cannot explain the convergence of the simplest optimization algorithm — deterministic gradient descent — in deep learning. Whereas traditional theories assert that gradient descent converges because the curvature of the loss landscape is “a priori” small, we will explain how in reality, gradient descent converges because it dynamically avoids high-curvature regions of the loss landscape. Understanding this behavior requires Taylor expanding to third order, which is one order higher than normally used in optimization theory. While the “fine-grained” dynamics of gradient descent involve chaotic oscillations that are difficult to analyze, we will demonstrate that the “time-averaged” dynamics are, fortunately, much more tractable. We will present an analysis of these time-averaged dynamics that yields highly accurate quantitative predictions in a variety of deep learning settings. Since gradient descent is the simplest optimization algorithm, we hope this analysis can help point the way towards a mathematical theory of optimization in deep learning.
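As a rough illustration of the classical picture the abstract contrasts against (not the speaker's analysis), consider gradient descent on a one-dimensional quadratic loss. With step size `step_size`, the iterates shrink only when the curvature is below `2 / step_size`, and they oscillate with growing amplitude above it. This is the textbook sense in which curvature must be "small" for convergence. The function name, the quadratic loss, and all numeric values below are assumptions chosen for illustration. A quadratic has no third-order term, so this sketch cannot show the curvature-avoiding behavior the talk describes.

```python
def gradient_descent_quadratic(curvature, step_size, x0=1.0, num_steps=50):
    """Run gradient descent on L(x) = 0.5 * curvature * x**2 and return all iterates."""
    xs = [x0]
    x = x0
    for _ in range(num_steps):
        grad = curvature * x          # dL/dx for the quadratic loss
        x = x - step_size * grad      # plain (deterministic) gradient descent step
        xs.append(x)
    return xs


step_size = 0.1                       # classical stability threshold: 2 / step_size = 20

# Curvature below the threshold: each step multiplies x by (1 - 0.1 * 15) = -0.5.
low = gradient_descent_quadratic(curvature=15.0, step_size=step_size)

# Curvature above the threshold: each step multiplies x by (1 - 0.1 * 25) = -1.5.
high = gradient_descent_quadratic(curvature=25.0, step_size=step_size)

print("final |x|, curvature 15:", abs(low[-1]))   # shrinks towards zero
print("final |x|, curvature 25:", abs(high[-1]))  # grows without bound
```

In both runs the sign of `x` flips at every step. Only the magnitude of the per-step factor `1 - step_size * curvature` decides whether the oscillation decays or blows up.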
Bio: Jeremy Cohen is a research fellow at the Flatiron Institute. He has recently been working on understanding optimization in deep learning. He obtained his PhD in 2024 from Carnegie Mellon University, advised by Zico Kolter and Ameet Talwalkar.
Series: This talk is part of the Machine Learning Reading Group @ CUED series.