BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Machine Learning is Linear Algebra - Andrew Gordon Wilson - New Yo
 rk University
DTSTART:20250213T160000Z
DTEND:20250213T170000Z
UID:TALK228163@talks.cam.ac.uk
CONTACT:123034
DESCRIPTION:I will talk about how modelling assumptions manifest themselve
 s as algebraic structure in a variety of settings\, including optimization
 \, attention\, and network parameters\, and how we can algorithmically exp
 loit that structure for better scaling laws with transformers. As part of 
 this effort\, I will present a unifying framework that enables searching a
 mong all linear operators expressible via an Einstein summation. This fram
 ework encompasses previously proposed structures\, such as low-rank\, Kron
 ecker\, Tensor-Train\, and Monarch\, along with many novel structures. We 
 develop a taxonomy of all such operators based on their computational and 
 algebraic properties\, which provides insights into their compute-optimal 
 scaling laws. Combining these insights with empirical evaluation\, we iden
 tify a subset of structures that achieve better performance than dense lay
 ers as a function of training compute\, which we then develop into a high-
 performance sparse mixture-of-experts layer.
LOCATION:https://cam-ac-uk.zoom.us/j/81897609356?pwd=HqbUQWnASjpBBZdaZo9r4
 3M9Gj4N3Q.1
END:VEVENT
END:VCALENDAR
