NMT Analysis: The Trade-Off Between Source and Target, and (a Bit of) the Training Process
- 👤 Speaker: Elena Voita (University of Edinburgh)
- 📅 Date & Time: Friday 18 June 2021, 12:00 - 13:00
- 📍 Venue: Virtual (Zoom)
Abstract
Join Zoom Meeting https://cl-cam-ac-uk.zoom.us/j/91424580226?pwd=WHFKRW1ORCtBck15SUVOdXowd29uUT09
Meeting ID: 914 2458 0226 Passcode: 333459
In Neural Machine Translation (and, more generally, conditional language modeling), the generation of a target token is influenced by two types of context: the source and the prefix of the target sequence. While many attempts to understand the internal workings of NMT models have been made, none of them explicitly evaluates relative source and target contributions to a generation decision. We propose a way to explicitly evaluate these relative source and target contributions to the generation process, and analyse NMT Transformer. When looking at changes in the contributions when conditioning on different types of prefixes, we show that models suffering from exposure bias are more prone to over-relying on target history (and hence to hallucinating) than the ones where the exposure bias is mitigated. Additionally, we analyze changes in the source and target contributions when varying the amount of training data, and during the training process. We find that models trained with more data tend to rely on source information more and to have more sharp token contributions; the training process is non-monotonic with several stages of different nature. If we have time, I’ll also talk about our ongoing work that takes a closer look at the phenomena learned during these training stages.
Series This talk is part of the NLIP Seminar Series series.
Included in Lists
- All Talks (aka the CURE list)
- bld31
- Cambridge Centre for Data-Driven Discovery (C2D3)
- Cambridge Forum of Science and Humanities
- Cambridge Language Sciences
- Cambridge talks
- Chris Davis' list
- Computer Education Research
- Computing Education Research
- Department of Computer Science and Technology talks and seminars
- Graduate-Seminars
- Guy Emerson's list
- Interested Talks
- Language Sciences for Graduate Students
- ndk22's list
- NLIP Seminar Series
- ob366-ai4er
- PMRFPS's
- rp587
- School of Technology
- Simon Baker's List
- Trust & Technology Initiative - interesting events
- Virtual (Zoom)
- yk449
Note: Ex-directory lists are not shown.
![[Talks.cam]](/static/images/talkslogosmall.gif)

Elena Voita (University of Edinburgh)
Friday 18 June 2021, 12:00-13:00