BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Geometry and Topology of Neural Network Optimization - Joan Bruna 
 (New York University\; University of California\, Berkeley)
DTSTART:20171030T163000Z
DTEND:20171030T172000Z
UID:TALK94024@talks.cam.ac.uk
CONTACT:INI IT
DESCRIPTION:The loss surface of deep neural networks has recently attrac
 ted interest in the optimization and machine learning communities as a p
 rime example of a high-dimensional non-convex problem. Some insights wer
 e recently gained using spin-glass models and mean-field approximation
 s\, but at the expense of simplifying the nonlinear nature of the model.
 \n\nIn this work\, we make no such assumptions and study conditions on t
 he data distribution and model architecture that prevent the existence o
 f bad local minima. We first take a topological approach and characteriz
 e the absence of bad local minima by studying the connectedness of the l
 oss-surface level sets. Our theoretical work quantifies and formalizes t
 wo important facts: (i) the landscape of deep linear networks has a radi
 cally different topology from that of deep half-rectified ones\, and (ii
 ) the energy landscape in the non-linear case is fundamentally controlle
 d by the interplay between the smoothness of the data distribution and m
 odel over-parametrization. Our main theoretical contribution is to prov
 e that half-rectified single-layer networks are asymptotically connected
 \, and we provide explicit bounds that reveal the aforementioned interpl
 ay.\n\nThe conditioning of gradient descent is the next challenge we add
 ress. We study this question through the geometry of the level sets\, an
 d we introduce an algorithm to efficiently estimate the regularity of su
 ch sets on large-scale networks. Our empirical results show that these l
 evel sets remain connected throughout the learning phase\, suggesting ne
 ar-convex behavior\, but they become exponentially more curved as the en
 ergy level decays\, in accordance with what is observed in practice wit
 h very-low-curvature attractors. Joint work with Daniel Freeman (UC Berk
 eley).
LOCATION:Seminar Room 1\, Newton Institute
END:VEVENT
END:VCALENDAR
