BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Mitigating Gender Bias in Morphologically Rich Languages - Ryan Co
 tterell\, NLIP\, University of Cambridge
DTSTART:20190524T110000Z
DTEND:20190524T120000Z
UID:TALK120274@talks.cam.ac.uk
CONTACT:Andrew Caines
DESCRIPTION:Gender bias exists in corpora of all of the world's languages:
  the bias is a function of what people talk about\, not of the grammar of
  a language. For this reason\, data-driven systems in NLP that are trained
  on this data will inherit such bias. Evidence of bias can be found in all sor
 ts of NLP technologies: word vectors\, language models\, coreference syste
 ms and even machine translation. Most of the research done to mitigate gen
 der bias in natural language corpora\, however\, has focused solely on Eng
 lish. For instance\, in an attempt to remove gender bias in English corpor
 a\, NLP practitioners often augment corpora by swapping gendered words: i.
 e.\, if "he is a smart doctor" appears\, add the sentence "she is a smart 
 doctor" to the corpus as well before training a model. The broader researc
 h question asked in this talk is the following: How can we mitigate gender
  bias in corpora from any of the world's languages\, not just in English? 
 As an example\, the simple swapping heuristic for English will not general
 ize to most of the world's languages. Indeed\, such a solution would not e
 ven apply to German\, since it marks gender on both nouns and adjectives a
 nd requires gender agreement throughout a sentence. In the context of Germ
 an\, this task is far more complicated: mapping "er ist ein kluger Arzt" t
 o "sie ist eine kluge Ärztin" requires more than simply swapping "er" wit
 h "sie" and "Arzt" with "Ärztin"—one also has to modify the article ("e
 in") and the adjective ("klug"). In this talk\, we present a machine-learn
 ing solution to this problem: we develop a novel neural random field that 
 generates such sentence-to-sentence transformations\, enforcing agreement 
 with respect to gender. We explain how to perform inference and morphologi
 cal reinflection to generate such transformations _without any_ labeled tr
 aining examples. Empirically\, using a novel metric of gender bias\, we sho
 w that the model reduces gender bias in corpora without sacrificing gramma
 ticality. Additionally\, we discuss concrete applications to coreference r
 esolution and machine translation.
LOCATION:FW26\, Computer Laboratory
END:VEVENT
END:VCALENDAR
