BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Attention Forcing: Improving attention-based sequence-to-sequence 
 models - Qingyun Dou\, University of Cambridge
DTSTART:20230330T130000Z
DTEND:20230330T140000Z
UID:TALK199087@talks.cam.ac.uk
CONTACT:Dr Kate Knill
DESCRIPTION:Autoregressive sequence-to-sequence models with attention mech
 anisms have achieved state-of-the-art performance in various tasks\, inc
 luding Neural Machine Translation (NMT)\, Automatic Speech Recognition (
 ASR) and Text-To-Speech (TTS). This talk introduces attention forcing\, 
 a group of training approaches that address the training-inference misma
 tch. For autoregressive models\, the standard training approach\, teache
 r forcing\, guides a model with the reference output history. During inf
 erence\, however\, the generated output history must be used. To reduce 
 the mismatch\, attention forcing guides the model with the generated out
 put history and the reference attention. Extensions of this general fram
 ework will be introduced for more challenging applications. For example\
 , most approaches addressing the training-inference mismatch are incompa
 tible with parallel training\, which is essential for Transformer models
 . In contrast\, the parallel version of attention forcing supports paral
 lel training\, and hence Transformer models. The effectiveness of attent
 ion forcing will be demonstrated by experiments in TTS and NMT.
LOCATION:Hybrid: LT6\, first floor\, Baker Building\, Engineering Dept or 
 Zoom: https://eng-cam.zoom.us/j/89657740934?pwd=d1RUR29PenZXUlFQNVNVeU8zN2
 xoUT09
END:VEVENT
END:VCALENDAR
