BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Sharp Characterization and Control of Global Dynamics of SGDs with
  Heavy Tails - Xingyu Wang (Northwestern University)
DTSTART:20240425T084500Z
DTEND:20240425T093000Z
UID:TALK214201@talks.cam.ac.uk
DESCRIPTION:The empirical success of deep learning is often attributed to 
 the mysterious ability of stochastic gradient descents (SGDs) to avoid sha
 rp local minima in the loss landscape\, as sharp minima are believed to le
 ad to poor generalization. To unravel this mystery and potentially further
  enhance such capability of SGDs\, it is imperative to go beyond the tradi
 tional local convergence analysis and obtain a comprehensive understanding
  of SGDs' global dynamics within complex non-convex loss landscapes. In th
 is talk\, we characterize the global dynamics of SGDs through the heavy-ta
 iled large deviations and local stability framework. This framework system
 atically characterizes the rare events in heavy-tailed dynamical systems\;
  building on this\, we characterize intricate phase transitions in the fir
 st exit times\, which leads to the heavy-tailed counterparts of the classi
 cal Freidlin-Wentzell and Eyring-Kramers theories. Moreover\, applying thi
 s framework to SGD\, we reveal a fascinating phenomenon in deep learning: 
 by injecting and then truncating heavy-tailed noises during the training p
 hase\, SGD can almost completely avoid sharp minima and hence achieve bett
 er generalization performance for the test data.
LOCATION:External
END:VEVENT
END:VCALENDAR
