Prosody transfer evaluation and temporal prosody control in speech synthesis

Papercup
Tuesday 06 July 2021, 12:00-13:00
Zoom: https://zoom.us/j/95352633552?pwd=RzJVK2UzOGZyNU5mVHd1Y1VPT2tDUT09.

If you have a question about this talk, please contact Dr Kate Knill.

This seminar will take place on Zoom.

Ctrl-P: Temporal Control of Prosodic Variation for Speech Synthesis

Abstract: We propose a model that generates speech explicitly conditioned on the three primary acoustic correlates of prosody: F0, energy and duration. The model is flexible about how the values of these features are specified: they can be externally provided, or predicted from text, or predicted then subsequently modified. Compared to a model that employs a variational auto-encoder to learn unsupervised latent features, our model provides more interpretable, temporally-precise, and disentangled control.

ADEPT: A Dataset for Evaluating Prosody Transfer

Abstract: We introduce an English corpus of prosodically-varied reference natural speech samples for evaluating prosody transfer. The samples include global and local variations across utterances. The corpus only includes prosodic variations that listeners are able to distinguish with reasonable accuracy, and we report these figures as a benchmark against which text-to-speech prosody transfer can be compared. We also propose a subjective prosody transfer evaluation methodology.

Speaker bios:

Tian Huey Teh is a machine learning engineer at Papercup, based in London. She completed the MSc Computational Statistics and Machine Learning programme at University College London in 2018. Since graduating, she has been working on text-to-speech (TTS) research and development, focusing on prosody modelling and scaling systems across languages.

Alexandra Torresquintero is a Data Engineer on the machine learning team at Papercup. She completed her MSc in Speech and Language Processing at the University of Edinburgh in 2019. Whilst at Papercup, she has worked on formalising the processing behind the TTS training data, including linguistic frontend optimisations, research into grapheme-to-phoneme (g2p) modelling, and building a database to store the company's data.

This talk is part of the CUED Speech Group Seminars series.
