BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Texts Come from People - How Demographic Factors Influence NLP Mod
 els - Dirk Hovy\, University of Copenhagen
DTSTART:20160318T140000Z
DTEND:20160318T150000Z
UID:TALK64565@talks.cam.ac.uk
CONTACT:Kris Cao
DESCRIPTION:The way we express ourselves is heavily influenced by our demo
 graphic background. For example\, we don't expect teenagers to talk the sa
 me way as retirees. Natural Language Processing (NLP) models\, however\, a
 re base
 d on a small demographic sample and approach all language as uniform. As a
  result\, NLP models perform worse on language from demographic groups tha
 t differ from the training data\, i.e.\, they encode a demographic bias. T
 his bias harms performance and can disadvantage entire user groups.\n\nSoc
 iolinguistics has long investigated the interplay of demographic factors a
 nd language use\, and it seems likely that the same factors are also prese
 nt in the data we use to train NLP systems.\n\nIn this talk\, I will show 
 how we can combine statistical NLP methods and sociolinguistic theories to
  the benefit of both fields. I present ongoing research into large-scale s
 tatistical analysis of demographic language variation to detect factors th
 at influence the performance (and fairness) of NLP systems\, and how we ca
 n incorporate demographic information into statistical models to address b
 oth problems.
LOCATION:FW26\, Computer Laboratory
END:VEVENT
END:VCALENDAR
