SummaryMixing: A Linear-Time Attention Alternative
- đ¤ Speaker: Shucong Zhang, Samsung AI Center
- đ Date & Time: Monday 21 October 2024, 12:00 - 13:00
- đ Venue: Hybrid: JDB Teaching Room, Engineering Department or Zoom: https://cam-ac-uk.zoom.us/j/87165608116?pwd=llCLiWvBAfOR7RtealbVtVbjXruh3O.1
Abstract
Modern speech processing systems rely on self-attention. Unfortunately, self-attention takes quadratic time in the length of the speech utterance, making training and inference on long sequences slower and more memory-hungry. Cheaper alternatives to self-attention for speech recognition have been developed, but they degrade performance. We propose a novel linear-time alternative to self-attention that, for the first time, reaches better accuracy. Our model, SummaryMixing, computes a mean over the whole utterance and feeds this summary back to each time step. Experiments are performed in three vital scenarios: an encoder-decoder offline model, an online streaming Transducer model, and a self-supervised model. In all three scenarios, SummaryMixing gives equal or better accuracy than self-attention, at lower cost.
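Based only on the abstract's description, the core idea can be sketched in a few lines of NumPy: each frame gets a local transform, a single mean summary is computed over the whole utterance, and that summary is fed back to every time step. All function and weight names here are hypothetical illustrations, not the actual SummaryMixing implementation; the point is that every operation is linear in the sequence length T, unlike the T×T score matrix of self-attention.

```python
import numpy as np

rng = np.random.default_rng(0)

def summary_mixing(x, W_local, W_summary, W_combine):
    """Hypothetical sketch of a SummaryMixing-style layer.

    x: (T, d) sequence of frame features. Every step below costs O(T),
    whereas self-attention would build a (T, T) attention matrix.
    """
    local = np.tanh(x @ W_local)                    # (T, d): per-time-step transform
    summary = np.tanh(x @ W_summary).mean(axis=0)   # (d,): one mean over the utterance
    tiled = np.broadcast_to(summary, local.shape)   # feed the summary back to each step
    # combine local and global information at every time step
    return np.tanh(np.concatenate([local, tiled], axis=-1) @ W_combine)

T, d = 50, 8
x = rng.standard_normal((T, d))
W_local = rng.standard_normal((d, d))
W_summary = rng.standard_normal((d, d))
W_combine = rng.standard_normal((2 * d, d))
out = summary_mixing(x, W_local, W_summary, W_combine)
assert out.shape == (T, d)
```

Doubling T doubles the work of every line above, which is what makes this kind of layer attractive for long utterances in both offline and streaming settings.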
Series: This talk is part of the CUED Speech Group Seminars series.
Included in Lists
- Cambridge Forum of Science and Humanities
- Cambridge Language Sciences
- Cambridge talks
- Chris Davis' list
- CUED Speech Group Seminars
- Guy Emerson's list
- Information Engineering Division seminar list
- PhD related