BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:CBL Alumni Talk: Examining Critiques in Bayesian Deep Learning - A
 ndrew Gordon Wilson
DTSTART:20210416T150000Z
DTEND:20210416T160000Z
UID:TALK158818@talks.cam.ac.uk
CONTACT:Elre Oldewage
DESCRIPTION:Approximate inference procedures in Bayesian deep learning hav
 e become scalable and practical\, often providing better accuracy and cali
 bration than classical training\, without significant computational overhe
 ad. However\, there have emerged several challenges to the Bayesian approa
 ch in deep learning. It was found in an empirical study that deep ensemble
 s\, formed from re-training an architecture and ensembling the result\, ou
 tperformed some approaches to approximate Bayesian inference --- which led
  to the question of whether we should pursue ensembling instead of Bayesia
 n methods in deep learning. It was later observed that several approximate
  inference approaches appear to raise the posterior to a power 1/T\, with 
 T less than 1\, leading to a "cold posterior"\, which was asserted as
 being "sharply divergent" with Bayesian principles. In the same paper\, th
 e popular Gaussian priors we use in deep learning were questioned as unrea
 sonable\, supported by an experiment showing that each sample function fro
 m a prior appears to assign nearly all of CIFAR-10 to a particular class.\
 n\nIn this talk\, we will examine these critiques\, and show that (1) deep
  ensembles provide a better approximation of the Bayesian predictive distr
 ibution than the approximate inference procedures considered in the empiri
 cal study\, and in general are a reasonable approach to approximate infere
 nce in deep learning under severe computational constraints\; (2) temperin
 g is in fact not typically required\, and is also a reasonable procedure i
 n general\; (3) the example of prior functions assigning nearly all data t
 o one class can be easily resolved by calibrating the signal variance of t
 he Gaussian prior\; (4) Gaussian priors\, while imperfect like any prior\,
  induce a prior over functions with many desirable properties when combine
 d with a neural architecture.\n\nA theme in this talk is that while we sho
 uld be careful to scrutinize our modelling procedures\, we should also app
 ly the same critical scrutiny to the critiques\, leading to a deeper and m
 ore nuanced understanding\, and more successful practical innovations.\n\n
 Sections 3.2\, 3.3\, 4-9 of https://arxiv.org/abs/2002.08791 provide good 
 background reading for the talk.
LOCATION:https://eng-cam.zoom.us/j/82969702755?pwd=L0dIVnlwSHJHV2NGbUQ1cm
 xpYjIyUT09
END:VEVENT
END:VCALENDAR
