Monolingual and multilingual, explicit and latent vector representations of meaning
- 👤 Speaker: Roberto Navigli (Sapienza University of Rome)
- 📅 Date & Time: Thursday 10 November 2016, 11:00 - 12:00
- 📍 Venue: GR05, English Faculty, 9 West Road (Sidgwick Site)
Abstract
In this talk I will present different kinds of representation of word senses and concepts. I will start with latent representations obtained as sense embeddings from the application of word2vec to the Wikipedia corpus, sense-tagged with a multilingual disambiguation algorithm based on BabelNet, the largest multilingual semantic network and encyclopedic dictionary covering 14 million concepts and entities and 271 languages.
I will then move on to two explicit vector representations of meaning (NASARI), based on lexical co-occurrence and multilingual semantic generalization, respectively, and a third latent version obtained from the word embeddings of the lexical vector.
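As a rough illustration of what an explicit lexical vector looks like, the sketch below builds sparse vectors whose dimensions are context words and compares two words through their closest pair of senses. It is a toy under stated assumptions, not the NASARI method: the weights are raw co-occurrence counts, the contexts are hand-written, and the function names (`explicit_vector`, `cosine`, `word_similarity`) are hypothetical.

```python
from collections import Counter
from math import sqrt


def explicit_vector(contexts):
    """Sparse explicit vector: one dimension per context word, valued by raw count.

    Assumption: raw counts stand in for whatever weighting a real system uses.
    """
    return Counter(word for context in contexts for word in context.split())


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(weight * v[word] for word, weight in u.items() if word in v)
    norm_u = sqrt(sum(w * w for w in u.values()))
    norm_v = sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def word_similarity(senses_a, senses_b):
    """Score two words by the similarity of their closest pair of senses."""
    return max(cosine(a, b) for a in senses_a for b in senses_b)


# Toy, hand-written contexts for each sense (hypothetical data).
bank_finance = explicit_vector(["money loan deposit account", "loan interest money"])
bank_river = explicit_vector(["river water shore", "water mud river"])
credit_union = explicit_vector(["loan money member account"])

similarity = word_similarity([bank_finance, bank_river], [credit_union])
```

Because each dimension is a readable word, it is easy to inspect why the financial sense of "bank" is closer to "credit union" than the river sense: the two vectors share the dimensions `loan`, `money` and `account`.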
Experimental results on several tasks, including word similarity, sense clustering, identification of sense predominance, and word sense disambiguation, demonstrate high performance and show that, whenever a comparison is possible, sense representations consistently outperform word representations.
This is joint work with José Camacho-Collados, Ignacio Iacobacci and Mohammad Taher Pilehvar.
Series
This talk is part of the Language Technology Lab Seminars series.