BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:The applications of discrete speech tokens for robust and context-
 aware text-to-speech synthesis - Chenpeng Du
DTSTART:20231211T120000Z
DTEND:20231211T130000Z
UID:TALK208156@talks.cam.ac.uk
CONTACT:Simon Webster McKnight
DESCRIPTION:In a conventional neural text-to-speech (TTS) pipeline\, there
  are typically two stages: firstly\, the prediction of a mel-spectrogram f
 rom text through an acoustic model\, followed by the generation of wavefor
 m data from the mel-spectrogram with a vocoder. However\, such systems oft
 en suffer from suboptimal quality and sensitivity to the quality of the tr
 aining data. We propose for the first time to leverage discrete speech tok
 ens from self-supervised models as the intermediate feature of the TTS pip
 eline\, leading to a significant improvement in robustness. Building upon
  this novel pipeline\, we extend its applications to context-aware TTS
  tasks\, where speech coherence with the context is taken into account
  during the speech generation process.
LOCATION:In-person for Cambridge University members only: JDB Teaching Roo
 m\, Engineering Department
END:VEVENT
END:VCALENDAR
