BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:LLMs and Low-Resource Languages - Eneko Agirre\, University of the
  Basque Country (UPV/EHU)
DTSTART:20241127T160000Z
DTEND:20241127T170000Z
UID:TALK224134@talks.cam.ac.uk
CONTACT:Tiancheng Hu
DESCRIPTION:Abstract: Generative AI models are now multilingual\, raising 
 new questions about their relative performance across languages and local 
 cultures\, especially for communities with fewer speakers. In this ta
 lk I will explore some of those questions and the lessons we learned alo
 ng the way. Is it possible to build high-performing LLMs for low-resourc
 e languages? We have built a high-performing open model for Basque accom
 panied by
  a fully reproducible end-to-end evaluation suite. Do LLMs think better in
  English than the local language? Our experiments show that LLMs do not fu
 lly exploit their multilingual potential when prompted in non-English lang
 uages. Do LLMs know about local culture? We probed the complex interaction
  between language and global/local knowledge\, showing for the first time 
 that local knowledge is transferred from the low-resource to the high-reso
 urce language\, a sign that prior findings may not hold when evaluated on 
 local topics. The evaluation suite was recognised with a best resource pap
 er award at ACL 2024.\n\n\n\nBio: Eneko Agirre is Full Professor of Inform
 atics and Head of HiTZ Basque Center of Language Technology at the Univers
 ity of the Basque Country\, UPV/EHU\, in San Sebastian\, Spain.\nHe has be
 en a visiting researcher or professor at New Mexico State\, Melbourne\, So
 uthern Califor
 nia\, Stanford and New York Universities. He has been active in Natural La
 nguage Processing and Computational Linguistics since his undergraduate da
 ys. He received the Spanish Informatics Research Award in 2021\, and is on
 e of the 74 fellows of the Association for Computational Linguistics (ACL
 ).
  He was President of ACL's SIGLEX\, member of the editorial board of Compu
 tational Linguistics\, Journal of Artificial Intelligence Research and Act
 ion Editor for the Transactions of the ACL. He is co-founder of the Joint 
 Conference on Lexical and Computational Semantics (*SEM). He is a recipien
 t of three Google Research Awards and six best paper awards and nomination
 s\, most recently at ACL 2024. Dissertations under his supervision receiv
 ed best PhD awards by EurAI\, the Spanish NLP society and the Spanish Inf
 orma
 tics Scientific Association. He has over 200 publications across a wide ra
 nge of NLP and AI topics\, and has given more than 20 invited talks\, most
 ly international.
LOCATION:GR04\, English Faculty Building\, 9 West Road\, Sidgwick Site and
  online https://cam-ac-uk.zoom.us/j/97599459216?pwd=QTRsOWZCOXRTREVnbTJBdX
 VpOXFvdz09
END:VEVENT
END:VCALENDAR
