BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Linear Transformers for Efficient Sequence Modeling - Prof Yoon Ki
 m\, MIT
DTSTART:20250123T150000Z
DTEND:20250123T160000Z
UID:TALK226012@talks.cam.ac.uk
CONTACT:Shun Shao
DESCRIPTION:Abstract:\n\nTransformers are still the dominant architecture 
 for language modeling (and generative AI more broadly). The attention mech
 anism in Transformers is considered core to the architecture and enables a
 ccurate sequence modeling at scale. However\, attention requires explicitl
 y modeling pairwise interactions amongst all elements of a sequence\, and 
 thus its complexity is quadratic in input length. This talk will describe 
 some recent work from our group on efficient architectural alternatives to
  Transformers for language modeling\, in particular linear Transformers\, 
 which can be reparameterized as an RNN and thus allow for linear-time\, 
 constant-memory sequence modeling. We also draw connections between 
 linear Transformers and recent state-space models such as Mamba.\n\nBio:
 \nYoon Kim is an assistant professor at MIT (EECS/CSAIL). He obtained his 
 PhD in computer science from Harvard University\, where he was advised 
 by Alexander Rush. Prof. Kim works on natural language processing and 
 machine learning. His current interests include:\n- Efficient training 
 and deployment of large-scale models\n- Understanding the capabilities 
 and limitations of language models\n- Symbolic mechanisms for 
 controlling and augmenting neural networks
LOCATION:https://cam-ac-uk.zoom.us/j/97599459216?pwd=QTRsOWZCOXRTREVnbTJBd
 XVpOXFvdz09
END:VEVENT
END:VCALENDAR
