BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Latent Concepts in Large Language Models - Prof. Pradeep Ravikumar
 \, Carnegie Mellon University
DTSTART:20250610T130000Z
DTEND:20250610T140000Z
UID:TALK232738@talks.cam.ac.uk
CONTACT:Prof. Ramji Venkataramanan
DESCRIPTION:Large Language Models (LLMs) have achieved remarkable fluency 
 and versatility -- but understanding how they represent meaning internally
  remains a challenge. In this talk\, we explore the emerging science of la
 tent concepts in LLMs: the semantic abstractions implicitly encoded in the
 ir internal activations.\n\nWe examine how concepts -- such as truthfulnes
 s\, formality\, or sentiment -- can be represented as low-dimensional stru
 ctures\, discovered through training dynamics\, and understood through the
  lens of linear algebra and associative memory. We discuss the implication
 s for interpretability\, robustness\, and control\, including how concepts
  can be steered at test time to adjust model behavior without retraining. 
 Specifically\, we explore empirical and theoretical evidence supporting th
 e linear representation hypothesis\, where such concepts correspond to vec
 tors or affine subspaces\, emerging naturally from training dynamics and n
 ext-token prediction objectives. We further show that LLMs behave as assoc
 iative memory systems\, retrieving outputs based on latent similarity rath
 er than logical inference. This behavior underlies phenomena such as conte
 xt hijacking\, where semantically misleading prompts can bias the model’
 s response.\n\nWe introduce formal latent concept models that unify these 
 ideas\, describe conditions under which concepts are identifiable\, and pr
 opose learning algorithms for extracting interpretable\, controllable repr
 esentations. We argue that such latent concept modeling offers a principle
 d framework for bridging representation learning with interpretability and
  model alignment\, and a promising path toward safer\, more controllable
 \, and more trustworthy AI.
LOCATION:JDB Seminar Room\, CUED
END:VEVENT
END:VCALENDAR
