BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Doubt thy models: rethinking hypothesis testing in NLP - Haim Dubo
 ssarsky (University of Cambridge)
DTSTART:20200131T120000Z
DTEND:20200131T130000Z
UID:TALK128893@talks.cam.ac.uk
CONTACT:James Thorne
DESCRIPTION:Recent years have seen the rise of machine learning models in 
 NLP research\, which are applied inter alia to research on questions mot
 ivated by linguistic theory. Indeed\, it has now become relatively easy to
  model and to test research problems. The ease with which models can be de
 ployed comes at the risk of careless use\, which may potentially lead to u
 nreliable findings and ultimately even hinder our ability to extend our kn
 owledge. Such misuse may stem\, for example\, from unfamiliarity with the 
 assumptions and hypotheses that are implicit in the models\, or inherent c
 onfounds that demand experimental controls.\nIn this talk\, I will focus o
 n problems that are specific to linguistically-motivated questions (e.g.\,
  semantic change)\, but also to classical NLP research more generally (e
 .g.\, polysemy resolution and representation)\, where word embeddings are 
 the prominent ML models. Major problems include biases induced by word fre
 quency\, similarity estimation of noisy word vector representations\, and 
 the evaluation of models’ performance in the absence of properly validat
 ed evaluation tasks in general. I will suggest ways to mitigate some of th
 ese problems\, and share some ideas about performing valid scientific rese
 arch in the age of all-too-easy modeling.\n
LOCATION:SS03\, Computer Laboratory
END:VEVENT
END:VCALENDAR