BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Machine Translation with LSTMs - Ilya Sutskever (Google)
DTSTART:20141128T103000Z
DTEND:20141128T113000Z
UID:TALK56309@talks.cam.ac.uk
CONTACT:Dr Jes Frellsen
DESCRIPTION:Deep Neural Networks (DNNs) are powerful models that have achi
 eved excellent performance on difficult learning tasks. Although DNNs work
  well whenever large labeled training sets are available\, they cannot be 
 used to map sequences to sequences. In this talk\, I will present a genera
 l end-to-end approach to sequence learning that makes minimal assumptions 
 on the sequence structure. The method uses a multilayered Long Short-Term 
 Memory (LSTM) to map the input sequence to a vector of a fixed dimensional
 ity\, and then another deep LSTM to decode the target sequence from the ve
 ctor. The main result is that on an English to French translation task fro
 m the WMT-14 dataset\, the translations produced by the LSTM achieve a BLE
 U score of 34.8 on the entire test set\, where the LSTM’s BLEU score is 
 penalized on out-of-vocabulary words.  While this performance is respectab
 le\, it is worse than state-of-the-art performance on this dataset (which 
 is 37.0) mainly due to the LSTM's inability to translate out-of-vocabulary
  (OOV) words.  In the second half of the talk\, I will present a simple me
 thod for addressing the OOV problem.  The method consists of annotating ea
 ch OOV word in the training set with a "pointer" to its origin in the sour
 ce sentence\, which makes it easy to translate the OOV words at test time 
 using a dictionary.  The new method achieves a BLEU score of 37.5\, which 
 is a new state-of-the-art.\n\nThis is joint work with Thang Luong\, Oriol 
 Vinyals\, Quoc Le\, and Wojciech Zaremba.
LOCATION:Engineering Department\, LR3B
END:VEVENT
END:VCALENDAR
