BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Efficient Transformers with Dynamic Token Pooling - Piotr Nawrot\,
  University of Edinburgh
DTSTART:20230302T110000Z
DTEND:20230302T120000Z
UID:TALK197941@talks.cam.ac.uk
CONTACT:Panagiotis Fytas
DESCRIPTION:Transformers achieve unrivalled performance in modelling
 language\, but remain inefficient in terms of memory and time complexity.
 A possible remedy is to reduce the sequence length in the intermediate
 layers by pooling fixed-length segments of tokens. Nevertheless\, natural
 units of meaning\, such as words or phrases\, display varying sizes. To
 address this mismatch\, we equip language models with a dynamic-pooling
 mechanism\, which predicts segment boundaries in an autoregressive
 fashion. We compare several methods to infer boundaries\, including
 end-to-end learning through stochastic re-parameterisation\, supervised
 learning (based on segmentations from subword tokenizers or spikes in
 conditional entropy)\, as well as linguistically motivated boundaries. We
 perform character-level evaluation on texts from multiple datasets and
 morphologically diverse languages. The results demonstrate that dynamic
 pooling\, which jointly segments and models language\, is often both
 faster and more accurate than vanilla Transformers and fixed-length
 pooling within the same computational budget.
LOCATION:GR04\, English Faculty Building\, 9 West Road\, Sidgwick Site
END:VEVENT
END:VCALENDAR
