BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Learning shallow neural networks in high dimensions: SGD dynamics
  and scaling laws - Denny Wu\, Faculty Fellow at the Center for Data Scien
 ce\, New York University and the Flatiron Institute.
DTSTART:20251120T140000Z
DTEND:20251120T150000Z
UID:TALK240625@talks.cam.ac.uk
CONTACT:Fernando Ruiz Mazo
DESCRIPTION:*Abstract*: We study the sample and time complexity of online 
 stochastic gradient descent (SGD) in learning a two-layer neural network w
 ith M orthogonal neurons on isotropic Gaussian data. We focus on the chall
 enging “extensive-width” regime M≫1 and allow for a large condition number
  in the second-layer parameters\, covering the power-law scaling a_m=m^{-β}
  as a special case. We characterize the SGD dynamics for the traini
 ng of a student two-layer neural network and identify sharp transition tim
 es for the recovery of each signal direction. In the power-law setting\, o
 ur analysis shows that while the learning of individual teacher neurons
  exhibits abrupt phase transitions\, the juxtaposition of emergent learning
  curves at different timescales results in a smooth scaling law in the cum
 ulative objective. \n\n*This talk is co-hosted by the Computer Laboratory 
 AI Research Group and the Informed-AI Hub.*\n
LOCATION:MR14\, Centre for Mathematical Sciences
END:VEVENT
END:VCALENDAR
