The interpretability wars
- Speaker: Rachel C. Zhang (DAMTP PhD)
- Date & Time: Wednesday 04 February 2026, 11:00 - 12:00
- Venue: MR10, Centre for Mathematical Sciences
Abstract
As AI systems are increasingly deployed in science and high-risk domains, interpretability has emerged as a critical concern. But what does “interpretability” actually mean, and is it even achievable? This journal club examines five recent papers and articles that offer different answers to these questions.
The journal club will discuss:
Barbiero et al., 2025: Foundations of Interpretable Models https://arxiv.org/pdf/2508.00545
Rowan et al., 2025: On the Definition and Importance of Interpretability in Scientific Machine Learning https://arxiv.org/pdf/2505.13510
Meloux et al., 2025: The Dead Salmons of AI Interpretability https://arxiv.org/pdf/2512.18792
Rudin, 2019: Stop Explaining Black Box Machine Learning Models for High Stakes Decisions and Use Interpretable Models Instead https://arxiv.org/pdf/1811.10154
Hendrycks and Hiscott, 2025 (article): The Misguided Quest for Mechanistic AI Interpretability https://ai-frontiers.org/articles/the-misguided-quest-for-mechanistic-ai-interpretability
Series: This talk is part of the DAMTP ML for Science Reading Group series.