BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Convex and non-convex worlds in machine learning - Anna Choromansk
 a (New York University)
DTSTART:20150701T100000Z
DTEND:20150701T110000Z
UID:TALK59569@talks.cam.ac.uk
CONTACT:Dr Jes Frellsen
DESCRIPTION:Title: Convex and non-convex worlds in machine learning\n \nAb
 stract:\n \nThe talk will focus on modern challenges in machine learnin
 g: designing good and efficient problem-specific solvers\, designing go
 od problem-specific objectives\, and building an understanding of non-c
 onvex deep learning optimization. In machine learning there is a pletho
 ra of approaches in which convexity is desired when solving a given pro
 blem\, owing to the existence of a unique global minimum. Convex proble
 ms give rise to theoretical guarantees and can typically be solved effi
 ciently. In the first part of the talk an example of a recently develop
 ed convex approach will be discussed which comes with strong theoretica
 l guarantees and where learning is done via reduction to a convex probl
 em. First\, we show the construction of a new solver for partition func
 tion-based optimization which reduces the problem to quadratic optimiza
 tion. Various applications of this variational bound will be discussed.
  The experimental results will show the advantages of the proposed meth
 od over state-of-the-art optimization techniques and will furthermore r
 un counter to the conventional wisdom that machine learning problems ar
 e best handled via generic optimization tools. The next part of the tal
 k will extend the previous setting by showing how to use efficient solv
 ers for a more general class of problems. The talk will focus on the mu
 lti-class setting. A reduction of this problem to a set of binary class
 ification problems organized in a tree structure will be discussed\, an
 d a new top-down criterion for purification of labels will be presented
  which guarantees train and test running times that are logarithmic in t
 he label complexity.\n \nThe discussed approaches either live in the wo
 rld of convex optimization or come with theoretical guarantees. Despite
  the success of convex methods\, deep learning methods\, where the obje
 ctive is inherently highly non-convex\, have enjoyed a resurgence of in
 terest in the last few years and achieve state-of-the-art performance.
  In the last part of the talk we move to the world of non-convex optimi
 zation\, where recent findings suggest that we might eventually be able
  to describe these approaches theoretically. The connection between the
  highly non-convex loss function of a simple model of the fully-connect
 ed feed-forward neural network and the Hamiltonian of the spherical spi
 n-glass model will be established. It will be shown that (i) for large-
 size networks\, most local minima are equivalent and yield similar perf
 ormance on a test set\, (ii) the probability of finding a “bad” (hi
 gh value) local minimum is non-zero for small-size networks and decreas
 es quickly with network size\, (iii) struggling to find the global mini
 mum on the training set (as opposed to one of the many good local ones)
  is not useful in practice and may lead to overfitting.\n\n \nBio:\n \n
 Anna Choromanska is a Post-Doctoral Associate in the Computer Science D
 epartment at the Courant Institute of Mathematical Sciences\, New York U
 niversity. She works in the Computational and Biological Learning Lab\,
  which is a part of the Computational Intelligence\, Learning\, Vision\,
  and Robotics Lab of Prof. Yann LeCun. She received her PhD from Columb
 ia University\, Department of Electrical Engineering\, where she held T
 he Fu Foundation School of Engineering and Applied Science Presidential
  Fellowship. She was advised by Prof. Tony Jebara. She completed her MS
 c with distinction in the Department of Electronics and Information Tec
 hnology\, Warsaw University of Technology\, with a double specializatio
 n in Electronics and Computer Engineering and Electronics and Informati
 cs in Medicine. She has worked with various industrial institutions\, i
 ncluding AT&T Shannon Research Laboratories\, IBM T.J. Watson Research
  Center\, and Microsoft Research New York. Her research interests are i
 n machine learning\, optimization\, and statistics with applications in
  biomedicine and neurobiology. She also holds a music degree from the M
 ieczyslaw Karlowicz Music School in Warsaw\, Department of Piano Play.
  She is an avid salsa dancer performing with the Ache Performance Group
 . Her other hobbies are painting and photography.\n
LOCATION:Engineering Department\, CBL Room BE-438
END:VEVENT
END:VCALENDAR
