BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Metrized Deep Learning: Fast & Scalable Training - Jeremy Bernst
 ein (MIT)
DTSTART:20250214T120000Z
DTEND:20250214T130000Z
UID:TALK225838@talks.cam.ac.uk
CONTACT:Suchir Salhan
DESCRIPTION:We build neural networks in a modular and programmatic way usi
 ng software libraries like PyTorch and JAX. But optimization theory has no
 t caught up to the flexibility of this paradigm\, and practical advances i
 n neural net optimization are largely heuristics driven. In this talk we a
 rgue that\, if we are to treat deep learning rigorously\, then we must bui
 ld our optimization theory programmatically and in lockstep with the neura
 l network itself. To instantiate this idea\, we propose the "modular norm"
 \, which is a norm on the weight space of general neural architectures. Th
 e modular norm is constructed by stitching together norms on individual te
 nsor spaces as the architecture is constructed. The modular norm has sever
 al applications: automatic Lipschitz certificates for general architecture
 s in both weights and inputs\; automatic learning rate transfer across sca
 le\; more recently\, we built the "duality theory" for the modular norm\, 
 leading to dualized optimizers like Muon\, which have set speed records fo
 r training transformers. We are building the theory of the modular norm in
 to a software library called Modula to ease the development and deployment
  of rigorous deep learning algorithms---you can find out more at https://m
 odula.systems/.\n\n
LOCATION:ONLINE ONLY. Here is the Zoom link: https://cam-ac-uk.zoom.us/j/4
 751389294?pwd=Z2ZOSDk0eG1wZldVWG1GVVhrTzFIZz09
END:VEVENT
END:VCALENDAR
