BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Generating Natural-Language Video Descriptions using LSTM Recurren
 t Neural Networks - Raymond Mooney\, University of Texas
DTSTART:20160518T150000Z
DTEND:20160518T160000Z
UID:TALK65183@talks.cam.ac.uk
CONTACT:Kris Cao
DESCRIPTION:We present a method for automatically generating English sente
 nces describing short videos using deep neural networks. Specifically\, we
  apply convolutional and Long Short-Term Memory (LSTM) recurrent networks 
 to translate videos to English descriptions using an encoder/decoder frame
 work. A sequence of image frames (represented using deep visual features)
  is first mapped to a vector encoding the full video\, and then this encod
 ing is mapped to a sequence of words. We have also explored how statistica
 l linguistic knowledge mined from large text corpora\, specifically LSTM l
 anguage models and lexical embeddings\, can improve the descriptions. Expe
 rimental evaluation on a corpus of short YouTube videos and movie clips
  annotated by Descriptive Video Service demonstrates the capabilities of
  the technique by comparing its output to human-generated descriptions.
LOCATION:FW26\, Computer Laboratory
END:VEVENT
END:VCALENDAR
