BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Revisiting Cross-Lingual Transfer Learning - Mikel Artetxe\, Meta
DTSTART:20221201T110000Z
DTEND:20221201T120000Z
UID:TALK193441@talks.cam.ac.uk
CONTACT:Panagiotis Fytas
DESCRIPTION:Given downstream training data in one language (typically Engl
 ish)\, the goal of cross-lingual transfer learning is to perform the task
  in another language. Existing approaches are broadly classified into thr
 ee categories: zero-shot (fine-tune a multilingual language model on Eng
 lish data and transfer it zero-shot to the target language)\, translate-
 train (translate the training data into the target language through MT a
 nd fine-tune a multilingual language model on it)\, and translate-test (
 translate the evaluation data into English through MT and use an English
  model). Prior work mostly finds that translate-train performs best\, fo
 llowed by zero-shot and translate-test\, and focuses on improving multil
 ingual models. In this three-part talk\, we will revisit some of the fun
 damentals of this problem\, challenging the conventional wisdom in the a
 rea. First\, we will see that a large part of the improvement from using
  parallel data can be attributed to explicitly modeling parallel interac
 tions\, and that similar improvements can be obtained using synthetic da
 ta. Second\, we will revisit the integration of MT into the pipeline\, s
 howing that the potential of translate-test has been largely underestima
 ted. Finally\, we will see how creating multilingual benchmarks through
  translation\, as is commonly done\, can result in evaluation artifacts
 \, which calls for reconsidering some prior findings.
LOCATION:GR04\, English Faculty Building\, 9 West Road\, Sidgwick Site
END:VEVENT
END:VCALENDAR
