Adaptive two-sample testing
- Speaker: Arthur Gretton (Gatsby Computational Neuroscience Unit, UCL)
- Date & Time: Friday 27 October 2023, 14:00 - 15:00
- Venue: MR12, Centre for Mathematical Sciences
Questions? Contact
Qingyuan Zhao
Abstract
I will address the problem of two-sample testing using the Maximum Mean Discrepancy (MMD). The MMD is an integral probability metric defined using a reproducing kernel Hilbert space (RKHS), with properties determined by the choice of kernel. For good test power, the kernel must be chosen in accordance with the properties of the distributions being compared. I will address two cases:
- The distributions being tested have densities, and the difference in densities lies in a Sobolev ball. The MMD test is then minimax optimal with a specific kernel depending on the smoothness parameter of the Sobolev ball. In practice, this parameter is unknown: to overcome this issue, I describe an aggregated test, called MMD Agg, which is adaptive to the smoothness parameter. The test power is maximised over the collection of kernels used, without requiring held-out data for kernel selection (which results in a loss of test power). MMD Agg controls the test level non-asymptotically, and achieves the minimax rate over Sobolev balls, up to an iterated logarithmic term. Guarantees hold for any product of one-dimensional translation invariant characteristic kernels.
- The distributions being tested may not have densities, but might be high dimensional (e.g. distributions over images). In this case, I will describe a heuristic for training neural net features for two-sample testing, by maximizing a proxy for test power over a held-out data set. This yields state-of-the-art performance on challenging real-world problems, for instance distinguishing between distributions over CIFAR images.
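For readers unfamiliar with the basic procedure that both cases build on, here is a minimal sketch of an MMD two-sample test with a single, fixed Gaussian kernel and a permutation-based p-value. It is not the aggregated MMD Agg test or the learned-feature method from the talk: the function names (`gaussian_gram`, `mmd2_unbiased`, `mmd_permutation_test`), the bandwidth of 1.0, and the number of permutations are all illustrative assumptions.

```python
import numpy as np


def gaussian_gram(z, bandwidth):
    """Gram matrix of the Gaussian kernel exp(-||a - b||^2 / (2 h^2))."""
    sq = np.sum(z ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (z @ z.T)
    # Clip tiny negative values caused by floating-point rounding.
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * bandwidth ** 2))


def mmd2_unbiased(k, idx_x, idx_y):
    """Unbiased estimate of squared MMD from a pooled Gram matrix k."""
    kxx = k[np.ix_(idx_x, idx_x)]
    kyy = k[np.ix_(idx_y, idx_y)]
    kxy = k[np.ix_(idx_x, idx_y)]
    m, n = len(idx_x), len(idx_y)
    # Within-sample terms exclude the diagonal (i == j) entries.
    term_xx = (kxx.sum() - np.trace(kxx)) / (m * (m - 1))
    term_yy = (kyy.sum() - np.trace(kyy)) / (n * (n - 1))
    return term_xx + term_yy - 2.0 * kxy.mean()


def mmd_permutation_test(x, y, bandwidth=1.0, n_perm=200, seed=0):
    """Return (statistic, p_value) for H0: x and y share a distribution.

    bandwidth, n_perm and seed are assumed defaults for illustration only.
    """
    rng = np.random.default_rng(seed)
    z = np.vstack([x, y])
    m = len(x)
    k = gaussian_gram(z, bandwidth)  # computed once, reused per permutation
    idx = np.arange(len(z))
    observed = mmd2_unbiased(k, idx[:m], idx[m:])
    exceed = 0
    for _ in range(n_perm):
        perm = rng.permutation(len(z))  # relabel the pooled sample
        if mmd2_unbiased(k, perm[:m], perm[m:]) >= observed:
            exceed += 1
    # The +1 terms count the observed statistic among the permutations,
    # which keeps the test level valid at finite sample sizes.
    p_value = (exceed + 1) / (n_perm + 1)
    return observed, p_value


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.normal(0.0, 1.0, size=(100, 2))
    y = rng.normal(2.0, 1.0, size=(100, 2))  # shifted mean
    stat, p = mmd_permutation_test(x, y)
    print(f"MMD^2 estimate: {stat:.4f}, p-value: {p:.4f}")
```

The fixed bandwidth is exactly the choice that the abstract identifies as critical for test power: a poorly matched kernel can make this test weak, which is the motivation for aggregating over a collection of kernels or learning the features.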
Series
This talk is part of the Statistics series.