CBL Alumni Talk: Examining Critiques in Bayesian Deep Learning
- 👤 Speaker: Andrew Gordon Wilson
- 📅 Date & Time: Friday 16 April 2021, 16:00 - 17:00
- 📍 Venue: https://eng-cam.zoom.us/j/82969702755?pwd=L0dIVnlwSHJHV2NGbUQ1cmxpYjIyUT09
Abstract
Approximate inference procedures in Bayesian deep learning have become scalable and practical, often providing better accuracy and calibration than classical training, without significant computational overhead. However, there have emerged several challenges to the Bayesian approach in deep learning. It was found in an empirical study that deep ensembles, formed from re-training an architecture and ensembling the result, outperformed some approaches to approximate Bayesian inference—- which led to the question of whether we should pursue ensembling instead of Bayesian methods in deep learning. It was later observed that several approximate inference approaches appear to raise the posterior to a power 1/T, with T less than 1, leading to a “cold posterior”, which was asserted as being “sharply divergent” with Bayesian principles. In the same paper, the popular Gaussian priors we use in deep learning were questioned as unreasonable, supported by an experiment showing that each sample function from a prior appears to assign nearly all of CIFAR -10 to a particular class.
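For reference, the tempered (“cold”) posterior referred to above can be written as:

```latex
p_T(\theta \mid \mathcal{D}) \;\propto\; \big(\, p(\mathcal{D} \mid \theta)\, p(\theta) \,\big)^{1/T},
```

with $T = 1$ recovering the standard Bayes posterior and $T < 1$ yielding a “cold” posterior that is more concentrated than the Bayes posterior.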
In this talk, we will examine these critiques, and show that (1) deep ensembles provide a better approximation of the Bayesian predictive distribution than the approximate inference procedures considered in the empirical study, and in general are a reasonable approach to approximate inference in deep learning under severe computational constraints; (2) tempering is in fact not typically required, and is also a reasonable procedure in general; (3) the example of prior functions assigning nearly all data to one class can be easily resolved by calibrating the signal variance of the Gaussian prior; (4) Gaussian priors, while imperfect like any prior, induce a prior over functions with many desirable properties when combined with a neural architecture.
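To make point (1) concrete: a deep ensemble can be read as a simple Monte Carlo approximation to the Bayesian model average, p(y|x, D) ≈ (1/M) Σₘ p(y|x, θₘ), where each θₘ comes from an independent retraining. A minimal sketch, in which the random “models” are hypothetical stand-ins for independently trained networks:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(logits):
    # numerically stable softmax over the last axis
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Stand-ins for M independently trained networks: each "model" here is just
# a fixed random linear map (hypothetical, for illustration only).
M, d, n_classes = 5, 8, 10
x = rng.normal(size=d)                                   # one dummy input
weights = [rng.normal(size=(d, n_classes)) for _ in range(M)]

# Deep-ensemble predictive: average the per-model class distributions,
# a crude Monte Carlo estimate of the Bayesian model average.
per_model = np.stack([softmax(x @ W) for W in weights])  # shape (M, n_classes)
predictive = per_model.mean(axis=0)                      # shape (n_classes,)
```

The averaged `predictive` is itself a valid probability distribution, and ensembling over more diverse modes of the posterior improves the approximation.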
A theme in this talk is that while we should be careful to scrutinize our modelling procedures, we should also apply the same critical scrutiny to the critiques, leading to a deeper and more nuanced understanding, and more successful practical innovations.
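Point (3) above can be illustrated with a toy sketch: a single random linear “network” drawn from a Gaussian prior (a much simpler stand-in for the deep-network experiment), showing how the prior signal variance controls how concentrated each sampled function’s class probabilities are. All names here are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)
d, n_classes, n_inputs = 32, 10, 500

# One draw of a toy single-layer "network" from a Gaussian prior:
# logits = (signal_std / sqrt(d)) * x @ W with W ~ N(0, 1), so signal_std
# sets the prior scale of the logits.
x = rng.normal(size=(n_inputs, d))
W = rng.normal(size=(d, n_classes))

def mean_top_prob(signal_std):
    """Average confidence in the argmax class across all inputs."""
    logits = (signal_std / np.sqrt(d)) * (x @ W)
    z = logits - logits.max(axis=1, keepdims=True)    # stable softmax
    probs = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    return probs.max(axis=1).mean()

# With a large signal variance the sampled function is nearly one-hot on
# almost every input; with a small, calibrated signal variance the class
# probabilities stay close to uniform.
print(mean_top_prob(10.0), mean_top_prob(0.1))
```

Calibrating the signal variance thus directly controls how opinionated each prior sample function is, without changing the Gaussian form of the prior.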
Sections 3.2, 3.3, 4-9 of https://arxiv.org/abs/2002.08791 provide good background reading for the talk.
Series
This talk is part of the Machine Learning Reading Group @ CUED series.
Included in Lists
- All Talks (aka the CURE list)
- bld31
- Cambridge Centre for Data-Driven Discovery (C2D3)
- Cambridge Forum of Science and Humanities
- Cambridge Language Sciences
- Cambridge talks
- Cambridge University Engineering Department Talks
- Centre for Smart Infrastructure & Construction
- Chris Davis' list
- Computational Continuum Mechanics Group Seminars
- custom
- Featured lists
- Guy Emerson's list
- Hanchen DaDaDash
- Inference Group Journal Clubs
- Inference Group Summary
- Information Engineering Division seminar list
- Interested Talks
- Machine Learning Reading Group
- Machine Learning Reading Group @ CUED
- Machine Learning Summary
- ML
- ndk22's list
- ob366-ai4er
- Quantum Matter Journal Club
- Required lists for MLG
- rp587
- School of Technology
- Simon Baker's List
- TQS Journal Clubs
- Trust & Technology Initiative - interesting events
- yk373's list
- yk449
Note: Ex-directory lists are not shown.