BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:An Introduction to Mechanistic Interpretability - Rachel C. Zhang 
 (DAMTP PhD)\, Liz Tan (DAMTP PhD)
DTSTART:20251028T153000Z
DTEND:20251028T163000Z
UID:TALK240046@talks.cam.ac.uk
CONTACT:Rachel C. Zhang
DESCRIPTION:In the first journal club\, we will be discussing mechanistic
  interpretability for language models. The meeting will be structured as
  follows:\n\n- Overview of mechanistic interpretability and a deep dive
  into transformer circuits (see ‘A Mathematical Framework for
  Transformer Circuits’ Anthropic 2021:
  https://transformer-circuits.pub/2021/framework/index
 .html)\n\n- Discussion of a recent paper studying how LLMs develop
  perceptual abilities by investigating how Claude 3.5 Haiku learns to
  perform linebreaking in fixed-width text (see ‘When Models Manipulate
  Manifolds: The Geometry of a Counting Task’ Anthropic 2025:
  https://transformer-circuits.pub/2025/linebreaks/index.html)
 \n\nIt is not necessary to read the above literature before the session!\n
LOCATION:B1.19 Potters Room\, Centre for Mathematical Sciences\, Cambridge
  CB3 0WA
END:VEVENT
END:VCALENDAR
