Multilingual Models for Distributed Semantics
- Speaker: Karl Moritz Hermann, Oxford University
- Date & Time: Friday 06 June 2014, 12:00 - 13:00
- Venue: FW26, Computer Laboratory
Abstract
In this talk I will present a technique for learning semantic representations, which extends the distributional hypothesis to multilingual data and joint-space embeddings. These models leverage parallel data and learn to strongly align the embeddings of semantically equivalent sentences, while maintaining sufficient distance between those of dissimilar sentences, using a form of noise-contrastive update.
A nice feature of these models is that they do not rely on word alignments or any syntactic information, which makes them easy to apply to a large number of diverse languages. I will also briefly describe an extension of this approach that learns semantic representations at the document level.
The talk will conclude with an analysis of these models and some empirical evaluation. Using several cross-lingual document classification tasks, I show that this approach can be used to learn semantically plausible, multilingual distributed representations.
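The noise-contrastive idea described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the talk: the additive composition of word vectors into a sentence vector, the squared Euclidean distance as the energy, the hinge form of the loss, the default margin of 1.0, and the function names `compose`, `energy`, and `hinge_loss`.

```python
def compose(word_vectors):
    """Build a sentence vector by summing word vectors element-wise.

    Additive composition is an assumption for this sketch; it needs no
    word alignments or syntactic information.
    """
    return [sum(dims) for dims in zip(*word_vectors)]


def energy(a, b):
    """Squared Euclidean distance between two sentence vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def hinge_loss(src, tgt, noise, margin=1.0):
    """Margin-based noise-contrastive loss (hypothetical form).

    The loss is zero once the aligned pair (src, tgt) is closer than the
    unrelated pair (src, noise) by at least `margin`.
    """
    return max(0.0, margin + energy(src, tgt) - energy(src, noise))


# Toy two-dimensional example with made-up vectors.
english = compose([[0.5, 0.0], [0.5, 1.0]])  # two word vectors
german = compose([[1.0, 1.0]])               # parallel sentence
random_sentence = compose([[1.0, 1.5]])      # noise sample

print(hinge_loss(english, german, random_sentence))
```

In this toy case the noise sentence sits too close to the English sentence, so the loss is positive and a gradient step would push the two apart while keeping the parallel pair together.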
Series
This talk is part of the NLIP Seminar Series.