BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Preference Alignment\, with Reference Mismatch\, and without Refer
 ence Models - James Thorne (KAIST)
DTSTART:20250131T120000Z
DTEND:20250131T130000Z
UID:TALK225853@talks.cam.ac.uk
CONTACT:Suchir Salhan
DESCRIPTION:Abstract: In this talk\, I'll cover two recent papers for pref
 erence alignment: Odds-Ratio Preference Optimisation (ORPO\, EMNLP 2024)\,
  discussing the role of the reference model for preference alignment (e.g.
  DPO\, RLHF)\, and Margin-aware Preference Optimization (under review @ CV
 PR)\, thinking about the risks of reference mismatch: where the preference
  alignment data has features diverging from the reference model.\n\nBio:
  James is an Assistant Professor at the KAIST Graduate School of AI\, South
  Korea\, working on large-scale and knowledge-intensive natural language u
 nderstanding. James recently completed his PhD at the University of Cambri
 dge where he developed models and methods for automated fact verification 
 and correction.\n\n[1] https://aclanthology.org/2024.emnlp-main.626/\n[2] 
 https://arxiv.org/pdf/2406.06424
LOCATION:Room SS03 with Hybrid Format. Here is the Zoom link for those who
 wish to join online: https://cam-ac-uk.zoom.us/j/4751389294?pwd=Z2ZOSDk0
 eG1wZldVWG1GVVhrTzFIZz09
END:VEVENT
END:VCALENDAR
