Convex and non-convex worlds in machine learning
- 👤 Speaker: Anna Choromanska (New York University) 🔗 Website
- 📅 Date & Time: Wednesday 01 July 2015, 11:00 - 12:00
- 📍 Venue: Engineering Department, CBL Room BE-438
Abstract
Title: Convex and non-convex worlds in machine learning
Abstract:
The talk will focus on the modern challenges in machine learning: designing good and efficient problem-specific solvers, designing good problem-specific objectives and building understanding of non-convex deep learning optimization. In machine learning there is a plethora of approaches when convexity is desired to solve a given problem due to the existence of unique global minimum. Convex problems give rise to theoretical guarantees and can typically be efficiently solved. In the first part of the talk an example of recently developed convex approach will be discussed, which come with strong theoretical guarantees, where learning is done via reduction to convex problem. First, we show the construction of a new solver for the partition function-based optimization which reduces the problem to quadratic optimization. Various applications of this variational bound will be discussed. The experimental results will show advantages of the proposed method over state-of-the-art optimization techniques and furthermore will run counter to the conventional wisdom that machine learning problems are best handled via generic optimization tools. The next part of the talk will extend the previous setting by showing how to use efficient solvers to more general class of problems. The talk will focus on the multi-class setting. A reduction of this problem to a set of binary classification problems organized in a tree structure will be discussed and a new top-down criterion for purification of labels will be presented which guarantees train and test running times that are logarithmic in the label complexity.
Discussed approaches either live in the world of convex optimization and/or come with theoretical guarantees. Despite the success of convex methods, deep learning methods, where the objective is inherently highly non-convex, have enjoyed a resurgence of interest in the last few years and they achieve state-of-the-art performance. In the last part of the talk we move to the world of non-convex optimization where recent findings suggest that we might eventually be able to describe these approaches theoretically. The connection between the highly non-convex loss function of a simple model of the fully-connected feed-forward neural network and the Hamiltonian of the spherical spin-glass model will be established. It will be shown that i) for large-size networks, most local minima are equivalent and yield similar performance on a test set, (ii) the probability of finding a “bad” (high value) local minimum is non-zero for small-size networks and decreases quickly with network size, (iii) struggling to find the global minimum on the training set (as opposed to one of the many good local ones) is not useful in practice and may lead to overfitting.
Bio:
Anna Choromanska is a Post-Doctoral Associate in the Computer Science Department at Courant Institute of Mathematical Sciences, New York University. She is working in the Computational and Biological Learning Lab, which is a part of Computational Intelligence, Learning, Vision, and Robotics Lab, of prof. Yann LeCun. She graduated with her PhD from Columbia University, Department of Electrical Engineering, where she was the The Fu Foundation School of Engineering and Applied Science Presidential Fellowship holder. She was advised by prof. Tony Jebara. She completed her MSc with distinctions in the Department of Electronics and Information Technology, Warsaw University of Technology with double specialization, Electronics and Computer Engineering and Electronics and Informatics in Medicine. She was working with various industrial institutions, including AT&T Shannon Research Laboratories, IBM T .J. Watson Reseatch Center and Microsoft Research New York. Her research interests are in machine learning, optimization and statistics with applications in biomedicine and neurobiology. She also holds a music degree from Mieczyslaw Karlowicz Music School in Warsaw, Department of Piano Play. She is an avid salsa dancer performing with the Ache Performance Group. Her other hobbies is painting and photography.
Series This talk is part of the Machine Learning @ CUED series.
Included in Lists
- All Talks (aka the CURE list)
- Biology
- bld31
- Cambridge Centre for Data-Driven Discovery (C2D3)
- Cambridge Forum of Science and Humanities
- Cambridge Language Sciences
- Cambridge Neuroscience Seminars
- Cambridge talks
- CBL important
- Chris Davis' list
- Creating transparent intact animal organs for high-resolution 3D deep-tissue imaging
- dh539
- dh539
- Engineering Department, CBL Room BE-438
- Featured lists
- Guy Emerson's list
- Hanchen DaDaDash
- Inference Group Summary
- Information Engineering Division seminar list
- Interested Talks
- Joint Machine Learning Seminars
- Life Science
- Life Sciences
- Machine Learning @ CUED
- Machine Learning Summary
- ML
- ndk22's list
- Neuroscience
- Neuroscience Seminars
- Neuroscience Seminars
- ob366-ai4er
- Required lists for MLG
- rp587
- Seminar
- Simon Baker's List
- Stem Cells & Regenerative Medicine
- Trust & Technology Initiative - interesting events
- yk373's list
- yk449
Note: Ex-directory lists are not shown.
![[Talks.cam]](/static/images/talkslogosmall.gif)



Wednesday 01 July 2015, 11:00-12:00