BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Towards a Theoretical Understanding of Deep Learning via the Minim
 um Description Length Principle - Dr Yoshinari Takeishi\, Kyushu Universit
 y
DTSTART:20260204T140000Z
DTEND:20260204T150000Z
UID:TALK239473@talks.cam.ac.uk
CONTACT:Prof. Ramji Venkataramanan
DESCRIPTION:Deep learning is a core machine learning technology that has d
 riven the rapid improvement and broad adoption of artificial intelligence 
 in recent years. It is based on learning with multilayer neural networks\,
  and models with massive numbers of parameters\, most notably large langua
 ge models (LLMs)\, have shown remarkable performance. However\, the theore
 tical foundations for why such large-scale models can be trained successfu
 lly and achieve high generalization performance are still incomplete\, and
  many researchers are actively working on this problem. In particular\, th
 ere is a gap between the insight from classical information criteria such 
 as AIC and MDL\, which suggests that preventing overfitting requires selec
 ting a model of an appropriate size\, and the empirical success of modern 
 deep learning. Bridging this gap is an important challenge.\n\nIn this tal
 k\, we tackle these theoretical challenges in deep learning from the viewp
 oint of the Minimum Description Length (MDL) principle. We first focus on 
 a simple two-layer neural network and present how one can obtain performan
 ce guarantees for an MDL estimator by leveraging a distinctive eigenvalue 
 structure of the Fisher information matrix that we have recently identifie
 d. We then discuss prospects for extending this approach to more complex d
 eep neural networks.\n\n(The previous talk on 28 January will provide us
 eful background\, but this talk will be self-contained.)
LOCATION:MR5\, CMS Pavilion A
END:VEVENT
END:VCALENDAR
