BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Continuous Diffusion for Mixed-Type Tabular Data - Markus Mueller\
 , Erasmus University\, Rotterdam
DTSTART:20241106T173000Z
DTEND:20241106T183000Z
UID:TALK224287@talks.cam.ac.uk
CONTACT:Andreas Bedorf
DESCRIPTION:Score-based generative models (or diffusion models for short) 
 have proven successful for generating text and image data. However\, the a
 daptation of this model family to mixed-type tabular data has fallen short
  so far. We propose CDTD\, a Continuous Diffusion model for mixed-type Ta
 bular Data. Specifically\, we combine score matching and score interpolati
 on to ensure a common continuous noise distribution for both continuous an
 d categorical features. We counteract the high heterogeneity inherent in
  mixed-type data with distinct\, adaptive noise schedules per featu
 re or per data type. The learnable noise schedules ensure optimally alloca
 ted model capacity and balanced generative capability. We homogenize the d
 ata types further with model-specific loss calibration and initialization 
 schemes tailored to mixed-type tabular data. Our experimental results show
  that CDTD consistently outperforms state-of-the-art benchmark models and
  captures feature correlations exceptionally well\, and that heterogeneity
  in the noise schedule design boosts sample quality.\n\nBio:\nMarkus Muel
 ler is a PhD candidate at the Econometric Institute of the Erasmus Univers
 ity Rotterdam. His research focuses on probabilistic machine learning\, wi
 th emphasis on its applications in the social and economic sciences. In pa
 rticular\, he is interested in the adaptation of deep generative models to
  tabular data and their use cases\, such as causal inference or missing
  value imputation.
LOCATION:DAMTP Pavillon A Room MR3 & https://cam-ac-uk.zoom.us/j/8178
 9586115?pwd=0muHnMU6aadOgNLIbJISnJ8JdY8yXI.1
END:VEVENT
END:VCALENDAR
