The interpretability wars

If you have a question about this talk, please contact Rachel Zhang.

As AI systems are increasingly deployed in science and high-risk domains, interpretability has emerged as a critical concern. But what does “interpretability” actually mean, and is it even achievable? This journal club examines five recent papers and articles that offer different answers to these questions.

The journal club will discuss:

Barbiero et al., 2025: Foundations of Interpretable Models https://arxiv.org/pdf/2508.00545

Rowan et al., 2025: On the Definition and Importance of Interpretability in Scientific Machine Learning https://arxiv.org/pdf/2505.13510

Meloux et al., 2025: The Dead Salmons of AI Interpretability https://arxiv.org/pdf/2512.18792

Rudin, 2019: Stop Explaining Black Box Machine Learning Models for High Stakes Decisions and Use Interpretable Models Instead https://arxiv.org/pdf/1811.10154

Hendrycks and Hiscott, 2025 (article): The Misguided Quest for Mechanistic AI Interpretability https://ai-frontiers.org/articles/the-misguided-quest-for-mechanistic-ai-interpretability

This talk is part of the DAMTP ML for Science Reading Group series.


© 2006-2025 Talks.cam, University of Cambridge.