BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Machine Learning on Tabular Data - Arik Reuter and John Bronskill 
 (University of Cambridge)
DTSTART:20260311T110000Z
DTEND:20260311T123000Z
UID:TALK245587@talks.cam.ac.uk
CONTACT:Xianda Sun
DESCRIPTION:Data in tabular form is ubiquitous in industries such as finan
 ce\, healthcare\, and education. Historically\, boosted decision trees and
  multi-layer perceptrons were the models of choice for making predictions 
 on tabular data. In the last couple of years\, neural process-based transf
 ormers (e.g. TabPFN\, TabICL) that are trained on synthetic data have surp
 assed traditional approaches in terms of speed and accuracy. Well-funded s
 tart-ups have recently exploited these advances\, offering easy-to-use
  tools aimed at enterprises. In this talk we will 1) introduce the basic
  concept
 s behind tabular machine learning\; 2) describe traditional tabular learni
 ng approaches including XGBoost\, CatBoost\, and RealMLP\; 3) do deep dive
 s on TabPFN and TabICL\; 4) examine the use of LLMs for making tabular pre
 dictions\; and finally 5) discuss some recent work on causal tabular found
 ation models.
LOCATION:Cambridge University Engineering Department\, CBL Seminar room BE
 4-38.
END:VEVENT
END:VCALENDAR
