BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Learning to Create and Reuse Words in Open-Vocabulary Language Mo
 deling - Kazuya Kawakami (DeepMind)
DTSTART:20171123T110000Z
DTEND:20171123T120000Z
UID:TALK95824@talks.cam.ac.uk
CONTACT:Dimitri Kartsaklis
DESCRIPTION:Fixed-vocabulary language models fail to account for one of th
 e most characteristic statistical facts of natural language: the frequent
  creation and reuse of new word types. Although character-level language m
 odels offer a partial solution in that they can create word types not atte
 sted in the training corpus\, they do not capture the “bursty” distrib
 ution of such words. In this talk\, we discuss a hierarchical LSTM languag
 e model that generates sequences of word tokens character by character wi
 th a caching mechanism that learns to reuse previously generated words. To
  validate our model we construct a new open-vocabulary language modeling c
 orpus (the Multilingual Wikipedia Corpus\; MWC) from comparable Wikipedia 
 articles in 7 typologically diverse languages and demonstrate the effectiv
 eness of our model across this range of languages.
LOCATION:SR-24\, English Faculty Building\, 9 West Road (Sidgwick Site)
END:VEVENT
END:VCALENDAR
