Tensor product representations for RNNs / Revisiting post-processing for word embeddings
- Speaker: Shuai Tang, University of California, San Diego
- Date & Time: Thursday 17 October 2019, 11:00-12:00
- Venue: Board room, Faculty of English, 9 West Rd (Sidgwick Site)
Abstract
1/ Tensor product representations for RNNs
Widely used recurrent units, including the Long Short-Term Memory (LSTM) and the Gated Recurrent Unit (GRU), perform well on natural language tasks, but their ability to learn structured representations is still questionable. Exploiting reduced Tensor Product Representations (TPRs), distributed representations of symbolic structure in which vector-embedded symbols are bound to vector-embedded structural positions, we propose the TPRU, a simple recurrent unit that, at each time step, explicitly executes structural-role binding and unbinding operations to incorporate structural information into learning. A gradient analysis of the proposed TPRU supports our model design, and its performance on multiple datasets shows the effectiveness of our design choices. Furthermore, observations from a linguistically grounded study demonstrate the interpretability of the TPRU.
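To make the binding and unbinding operations concrete, here is a minimal numpy sketch of plain role-filler TPRs. The dimensions and the assumption of orthonormal role vectors are ours for illustration, not the TPRU's actual parameterisation: binding sums outer products of filler and role vectors, and unbinding multiplies by a role vector to recover its filler.

```python
import numpy as np

rng = np.random.default_rng(0)
d_filler, n_roles = 8, 4

# Hypothetical fillers (vector-embedded symbols), one per structural position.
fillers = rng.standard_normal((n_roles, d_filler))
# Orthonormal role vectors (vector-embedded structural positions).
roles, _ = np.linalg.qr(rng.standard_normal((n_roles, n_roles)))

# Binding: the sum of outer products filler_i (x) role_i gives one distributed
# representation of the whole structure (a d_filler x n_roles matrix).
T = sum(np.outer(f, r) for f, r in zip(fillers, roles.T))

# Unbinding: with orthonormal roles, multiplying by a role vector
# recovers exactly the filler bound to that position.
recovered = T @ roles[:, 2]
print(np.allclose(recovered, fillers[2]))  # True
```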
2/ Revisiting post-processing for word embeddings
Word embeddings learnt from large corpora have been adopted in various applications in natural language processing and serve as general input representations to learning systems. Recently, a series of post-processing methods have been proposed to boost the performance of word embeddings on similarity comparison and analogy retrieval tasks, and some have been adapted to compose sentence representations. The general hypothesis behind these methods is that by enforcing the embedding space to be more isotropic, the similarity between words can be better expressed. We view these methods as an approach to shrinking the covariance/Gram matrix, which is estimated by learning word vectors, towards a scaled identity matrix. By optimising an objective in the semi-Riemannian manifold with Centralised Kernel Alignment (CKA), we are able to search for the optimal shrinkage parameter, and we provide a post-processing method that smooths the spectrum of learnt word vectors and yields improved performance on downstream tasks.
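As a rough illustration of the shrinkage view described above (with a hand-picked shrinkage parameter standing in for the CKA-based selection, which is the talk's actual contribution), the following numpy sketch interpolates the spectrum of the embedding covariance towards a scaled identity:

```python
import numpy as np

def smooth_spectrum(E, alpha):
    """Shrink the covariance of word vectors E (n_words x dim) towards a
    scaled identity and re-scale E to match the shrunken spectrum.
    `alpha` is a hand-picked stand-in for the shrinkage parameter that
    the talk proposes to select by optimising a CKA objective."""
    E = E - E.mean(axis=0)                      # centre the embeddings
    C = E.T @ E / E.shape[0]                    # sample covariance
    eigvals, eigvecs = np.linalg.eigh(C)
    # Shrinking towards (tr(C)/dim) * I keeps the eigenvectors of C and
    # interpolates each eigenvalue with the mean eigenvalue.
    shrunk = (1 - alpha) * eigvals + alpha * eigvals.mean()
    scale = np.sqrt(shrunk / np.maximum(eigvals, 1e-12))
    return ((E @ eigvecs) * scale) @ eigvecs.T  # flatter, more isotropic spectrum

# Toy usage with random vectors standing in for learnt embeddings.
E = np.random.default_rng(1).standard_normal((1000, 50))
E_post = smooth_spectrum(E, alpha=0.3)
```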
Series
This talk is part of the Language Technology Lab Seminars series.
Included in Lists
- bld31
- Board room, Faculty of English, 9 West Rd (Sidgwick Site)
- Cambridge Centre for Data-Driven Discovery (C2D3)
- Cambridge Forum of Science and Humanities
- Cambridge Language Sciences
- Cambridge talks
- Chris Davis' list
- Guy Emerson's list
- Interested Talks
- Language Sciences for Graduate Students
- Language Technology Lab Seminars
- ndk22's list
- ob366-ai4er
- rp587
- Simon Baker's List
- Trust & Technology Initiative - interesting events
- yk449