Geographically Grounded Language Models
- Speaker: Speaker to be confirmed
- Date & Time: Thursday 01 February 2024, 11:00 - 12:00
- Venue: https://cam-ac-uk.zoom.us/j/97599459216?pwd=QTRsOWZCOXRTREVnbTJBdXVpOXFvdz09
Abstract
Textual data exhibit pronounced variation along geographical dimensions (e.g., due to dialect differences). Common practices of training and deploying language models do not take this inherent variation into account, which harms their robustness and their performance on downstream tasks. In this talk, I will explore how some shortcomings of text-only NLP pipelines can be alleviated by grounding language models in geography. I will first give a brief overview of prior research on geographical variation in NLP. I will then present geoadaptation, a method for geographically grounding language models that combines language modeling with geolocation prediction in a multi-task learning setup. Geoadaptation leads to consistent performance improvements across a range of tasks and language areas, especially in zero-shot settings. Finally, I will show that the effectiveness of geoadaptation stems from its ability to geographically retrofit the representation space of language models.
Series
This talk is part of the Language Technology Lab Seminars series.
Included in Lists
- bld31
- Cambridge Centre for Data-Driven Discovery (C2D3)
- Cambridge Forum of Science and Humanities
- Cambridge Language Sciences
- Cambridge talks
- Chris Davis' list
- Guy Emerson's list
- https://cam-ac-uk.zoom.us/j/97599459216?pwd=QTRsOWZCOXRTREVnbTJBdXVpOXFvdz09
- Interested Talks
- Language Sciences for Graduate Students
- Language Technology Lab Seminars
- ndk22's list
- ob366-ai4er
- rp587
- Simon Baker's List
- Trust & Technology Initiative - interesting events
- yk449
Note: Ex-directory lists are not shown.