Multilingual Image Description with Neural Sequence Models

If you have a question about this talk, please contact Kris Cao.

We introduce multilingual image description, the task of generating descriptions of images given data in multiple languages. This can be viewed as visually grounded machine translation, allowing the image to play a role in disambiguating language. We present models for this task that are inspired by neural models for image description and machine translation. Our multilingual image description models generate target-language sentences using features transferred from separate models: multimodal features from a monolingual source-language image description model and visual features from an object recognition model. In experiments on a dataset of images paired with English and German sentences, using BLEU and Meteor as metrics, our models substantially improve upon existing monolingual image description models.

This talk is part of the NLIP Seminar Series.

© 2006-2025 Talks.cam, University of Cambridge.