BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Randomized tree ensembles: output kernels and variable importances
  - Pierre Geurts\, University of Liège
DTSTART:20131126T100000Z
DTEND:20131126T110000Z
UID:TALK48349@talks.cam.ac.uk
CONTACT:Microsoft Research Cambridge Talks Admins
DESCRIPTION:Methods based on ensembles of randomized trees\, such as rando
 m forests and extremely randomized trees\, underlie many successful
  applications in domains such as computer vision and bioinformatics. The
  main advantages of these methods include statistical and computational
  efficiency\, ease of use\, flexibility\, and interp
 retability. This talk focuses on two methodological developments around th
 ese methods. First\, we will present a principled generalization of classi
 fication and regression trees to make predictions in a kernel-induced outp
 ut space. From a sample of both input feature vectors and a Gram matrix of
  output kernel values\, the resulting method\, called output kernel trees\
 , learns a model of an output kernel as a function of the input features. 
 This generalization naturally opens tree-based methods to structured outpu
 t prediction and supervised kernel learning. The practical interest of the
  method will be illustrated on the problem of supervised graph inference. 
 The second part of the talk will be devoted to variable importances derive
 d from ensembles of randomized trees. Despite growing interest and practic
 al use in various scientific areas\, these variable importances are not we
 ll understood from a theoretical point of view. In an attempt to fill this
  gap\, we will present a theoretical analysis of the mean decrease impurit
 y variable importances as measured by an ensemble of totally randomized tr
 ees in asymptotic conditions. In particular\, we demonstrate that the impo
 rtance of a variable is equal to zero if and only if the variable is irrel
 evant and that the importance of a relevant variable is invariant with res
 pect to the removal or the addition of irrelevant variables. These
  properties will be illustrated\, and we will discuss how they may change
  for trees that are not totally randomized\, such as random forests and
  extremely randomized trees.
LOCATION:Auditorium\, Microsoft Research Ltd\, 21 Station Road\, Cambridge
 \, CB1 2FB
END:VEVENT
END:VCALENDAR
