BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:How Do Language Models Reason and Compute? A Mechanistic Interpret
 ability Approach - Julia Dima (DIS MPhil)
DTSTART:20260311T110000Z
DTEND:20260311T120000Z
UID:TALK245650@talks.cam.ac.uk
CONTACT:Liz Tan
DESCRIPTION:Mechanistic interpretability aims to uncover the internal algo
 rithms implemented by neural networks by identifying the circuits responsi
 ble for specific behaviours.\n\nIn this talk\, we introduce the goals and
  methods of mechanistic interpretability for LLMs\, including recent appro
 aches based on sparse feature decompositions\, circuit analysis\, and attr
 ibution graphs. We discuss how these tools can help us better underst
 and the internal mechanisms behind specific model behaviours\, such as re
 asoning or arithmetic\, and why these mechanisms matter for scientific in
 sight into LLMs.\n\nWe will base our discussion on a medium-scale langua
 ge model (Qwen3-4B) and build on ideas from "On the Biology of a Large La
 nguage Model" (Anthropic\, 2025).
LOCATION:MR10\, Centre for Mathematical Sciences
END:VEVENT
END:VCALENDAR