BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:NAACL practice talks - Simon Baker (LTL) & Marek Rei (NLIP)\,
  University of Cambridge
DTSTART:20180525T110000Z
DTEND:20180525T120000Z
UID:TALK104650@talks.cam.ac.uk
CONTACT:Andrew Caines
DESCRIPTION:*Zero-shot Sequence Labeling: Transferring Knowledge from Sent
 ences to Tokens*\n\nMarek Rei & Anders Søgaard\n\nCan attention- or gradi
 ent-based visualization techniques be used to infer token-level labels for
  binary sequence tagging problems\, using networks trained only on sentenc
 e-level labels?\nWe construct a neural network architecture based on soft 
 attention\, train it as a binary sentence classifier and evaluate against 
 token-level annotation on four different datasets. Inferring token labels 
 from a network provides a method for quantitatively evaluating what the mo
 del is learning\, along with generating useful feedback in assistance syst
 ems.\nOur results indicate that attention-based methods are able to predic
 t token-level labels more accurately than gradient-based methods\, someti
 mes even rivaling the supervised oracle network.\n\n*Variable Typ
 ing: Assigning Meaning to Variables in Mathematical Text*\n\nYiannos A. St
 athopoulos\, Simon Baker\, Marek Rei & Simone Teufel\n\nInformation about 
 the meaning of mathematical variables in text is useful in NLP/IR tasks su
 ch as symbol disambiguation\, topic modeling and mathematical information 
 retrieval (MIR). We introduce variable typing\, the task of assigning one 
 mathematical type (multi-word technical terms referring to mathematical co
 ncepts) to each variable in a sentence of mathematical text. As part of th
 is work\, we also introduce a new annotated data set composed of 33\,524 d
 ata points extracted from scientific documents published on arXiv. Our int
 rinsic evaluation demonstrates that our data set is sufficient to successf
 ully train and evaluate current classifiers from three different model arc
 hitectures. The best performing model is evaluated on an extrinsic task: M
 IR\, by producing a typed formula index. Our results show that the best pe
 rforming MIR models make use of our typed index\, compared to a formula in
 dex only containing raw symbols\, thereby demonstrating the usefulness of 
 variable typing.
LOCATION:FW26\, Computer Laboratory
END:VEVENT
END:VCALENDAR
