BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:How to climb a billion dimensional hill using a coin and a compass
  and count the steps before departure - Peter Richtarik (University of Edi
 nburgh)
DTSTART:20120503T140000Z
DTEND:20120503T150000Z
UID:TALK36496@talks.cam.ac.uk
CONTACT:Carola-Bibiane Schoenlieb
DESCRIPTION:SUBTITLE: Parallel block coordinate descent methods for huge-s
 cale partially separable problems\n\nWith growing digitization of the worl
 d it is increasingly easy to collect monstrous amounts of data. Often\, th
 is data is analyzed using an optimization algorithm\, leading to diffi
 cult huge-scale problems with millions or billions of variables.\n\nExisti
 ng optimization algorithms\, which are perfectly suited for solving proble
 ms of medium size\, such as polynomial-time interior-point methods\, are o
 ften not useful in this new setting due to the poor dependence of their com
 plexity on the problem dimension. Hence\, there is a pressing need to devi
 se and analyze new methods\, or adapt classical methods\, that would be ca
 pable of working in the emerging huge-scale setting. An entirely different
  approach to the scale problem is via acceleration of existing methods on 
 parallel computing architectures such as many-core computers and clusters 
 thereof\, or systems based on graphics processing units (GPUs).\n\nIn th
 is talk we describe a new method that combines the two approaches outlined
 above. Our method i) has a good dependence on the problem dimension and
  ii) is parallel in nature\, and hence is well-suited for solving certai
 n structured optimization problems of huge dimension. In particular\, we s
 how that randomized block coordinate descent methods\, such as those devel
 oped in [1\, 2]\, can be accelerated by parallelization when applied to th
 e problem of minimizing the sum of a partially block separable smooth conv
 ex function and a simple block separable convex function.\n\nMany problems
  of current interest in diverse communities (statistics\, optimal experime
 ntal design\, machine learning\, mechanical engineering) can be cast in th
 is form\, including least-squares\, L1-regularized least-squares (LASSO)\,
  group and sparse group LASSO\, computing c- and A-optimal designs of sta
 tistical experiments\, training (sparse) linear support vector machines (S
 VM) and truss topology design (TTD).\n\nWe describe a generic parallel ran
 domized block coordinate descent algorithm (PR-BCD) and several variants t
 hereof based on the way parallelization is performed. In all cases we prov
 e iteration complexity results\, i.e.\, we give bounds on the number of it
 erations sufficient to approximately solve the problem with high probabili
 ty. Our results generalize the intuitive observation that in the separable
  case the theoretical speedup caused by parallelization should be equal to
  the number of processors. We show that the speedup increases with the num
 ber of processors and with the degree of partial separability of the smoot
 h component of the objective function. Our analysis also works in the regi
 me where the number of blocks updated at each iteration is random\, whi
 ch allows for modelling situations with a variable number of (busy or unre
 liable) processors.\n\nWe conclude with some encouraging computational res
 ults applied to huge-scale LASSO\, SVM and TTD instances.\n\nAll results are
  based on joint work with Martin Takac (Edinburgh).\n\n[1] Yurii Nesterov
 . Efficiency of coordinate descent methods on huge-scale optimization prob
 lems. CORE Discussion Paper #2010/2\, February 2010.\n\n[2] Peter Richtari
 k and Martin Takac. Iteration complexity of randomized block-coordinate de
 scent methods for minimizing a composite function. After 1st revision in M
 athematical Programming A\, 2011.
LOCATION:MR14\, CMS
END:VEVENT
END:VCALENDAR
