“End-to-end multi-speaker neural TTS with LLM-based prosody prediction”
- 👤 Speaker: Penny Karanasou, Amazon R&D
- 📅 Date & Time: Monday 29 January 2024, 12:00 - 13:00
- 📍 Venue: Hybrid: JDB Teaching Room, Engineering Department or Zoom: https://cam-ac-uk.zoom.us/j/87012963681?pwd=bXRwNis2SW93aHhxUndScnp2MUVTQT09
Abstract
In recent years, Neural Text-to-Speech (NTTS) has revolutionised the TTS field and resulted in more natural, more expressive speech. At Amazon, with products like Alexa and AWS Polly services, we are bringing generated speech in tens of voices and languages to millions of people. In Amazon TTS Research, we are tackling a variety of research problems, from generative TTS, prosody transfer, and neural front-ends to machine dubbing and on-device TTS. In this presentation I will focus on part of my team's research, as published at Interspeech and SSW 2023. First, I will give a summary of the Amazon TTS papers presented at these two conferences in 2023. I will then present our work on eCat, a novel end-to-end multi-speaker model capable of: a) generating long-context speech with expressive and contextually appropriate prosody, and b) performing fine-grained prosody transfer between any pair of seen speakers. eCat improves TTS performance over our previous internal baselines, and when compared to VITS, a state-of-the-art TTS model, it is statistically significantly preferred. I will continue with a comparative study of fifteen pretrained language models for two TTS tasks: prosody prediction and pause prediction. Our findings revealed a logarithmic relationship between model size and quality, as well as significant performance differences between neutral and expressive prosody.
Series: This talk is part of the CUED Speech Group Seminars series.
Included in Lists
- Cambridge Forum of Science and Humanities
- Cambridge Language Sciences
- Cambridge talks
- Chris Davis' list
- CUED Speech Group Seminars
- Guy Emerson's list
- Hybrid: JDB Teaching Room, Engineering Department or Zoom: https://cam-ac-uk.zoom.us/j/87012963681?pwd=bXRwNis2SW93aHhxUndScnp2MUVTQT09
- Information Engineering Division seminar list
- PhD related