How Do Language Models Reason and Compute? A Mechanistic Interpretability Approach
- đ¤ Speaker: Julia Dima (DIS MPhil)
- đ Date & Time: Wednesday 11 March 2026, 11:00 - 12:00
- đ Venue: MR10, Centre for Mathematical Sciences
Abstract
Mechanistic interpretability aims to uncover the internal algorithms implemented by neural networks by identifying the circuits responsible for specific behaviours.
In this talk, we introduce the goals and methods of mechanistic interpretability for LLMs, including recent approaches based on sparse feature decompositions, circuit analysis, and attribution graphs. We discuss how these tools can help us better understand the internal mechanisms behind specific model behaviours, such as reasoning or arithmetic, and why uncovering these mechanisms matters for scientific insight into LLMs.
We will base our discussion on a medium-scale language model (Qwen3-4B) and build on ideas from *On the Biology of a Large Language Model* (Anthropic, 2025).
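For readers unfamiliar with sparse feature decompositions, the sketch below shows the basic shape of a sparse autoencoder applied to residual-stream activations. It is a minimal illustration, not material from the talk: the hidden size (2560, plausible for a model of Qwen3-4B's scale), the feature count, and the PyTorch framing are all illustrative assumptions.

```python
# Minimal sketch of a sparse-autoencoder feature decomposition.
# Dimensions and hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    """Decomposes activations into a sparse, overcomplete set of features."""
    def __init__(self, d_model: int = 2560, d_features: int = 16384, l1_coeff: float = 1e-3):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_features)
        self.decoder = nn.Linear(d_features, d_model)
        self.l1_coeff = l1_coeff

    def forward(self, activations: torch.Tensor):
        # Encode to a non-negative, sparse feature vector, then reconstruct.
        features = torch.relu(self.encoder(activations))
        reconstruction = self.decoder(features)
        # Loss = reconstruction error + L1 penalty encouraging sparse feature use.
        loss = ((reconstruction - activations) ** 2).mean() + self.l1_coeff * features.abs().mean()
        return features, reconstruction, loss

# Example: decompose a batch of stand-in activations (random here, captured
# from a language model's residual stream in practice).
sae = SparseAutoencoder()
acts = torch.randn(8, 2560)
features, recon, loss = sae(acts)
print(features.shape, loss.item())
```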
Series
This talk is part of the DAMTP ML for Science Reading Group series.