Statistical Parametric Speech Synthesis Based on Speaker and Language Factorization
- Speaker: Heiga Zen (Toshiba Research Europe Ltd.)
- Date & Time: Tuesday 21 June 2011, 13:00-14:30
- Venue: Cambridge University Engineering Department, Lecture Room 11
Abstract
An increasingly common scenario in building hidden Markov model-based speech synthesis and recognition systems is training on inhomogeneous data, for example data drawn from multiple sources and/or of different types. This seminar introduces a new technique for training hidden Markov models on such inhomogeneous speech data, in this case data that includes speaker and language variations. The proposed technique, speaker and language factorization, attempts to factorize the speaker-specific and language-specific characteristics in the data and model each with its own transform. Language-specific factors are represented by transforms based on cluster mean interpolation with cluster-dependent decision trees. Acoustic variations caused by speaker characteristics are handled by transforms based on constrained maximum likelihood linear regression. This technique allows multi-speaker/multi-language adaptive training to be performed. Since each factor is represented by an individual transform, it is possible to factor in only one of them. Experimental results on statistical parametric speech synthesis show that the proposed technique enables the speaker and language to be factorized, allowing a speaker transform estimated in one language to be used successfully to synthesize speech in a different language while keeping the voice characteristics.
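As a rough illustration of the two kinds of transform named in the abstract, the sketch below computes a language-dependent mean as a weighted sum of cluster means and applies a speaker transform as an affine map on a feature vector, which is the general form of constrained maximum likelihood linear regression. It is a toy under stated assumptions, not the speaker's implementation: all names and numbers (`cluster_means`, `language_weights`, `speaker_transforms`, the 2-dimensional vectors) are hypothetical, and the cluster-dependent decision trees are omitted by assuming the relevant mean from each cluster has already been selected.

```python
def interpolate_cluster_means(cluster_means, weights):
    """Language-dependent mean: weighted sum of one mean vector per cluster."""
    dim = len(cluster_means[0])
    return [
        sum(w * mu[d] for w, mu in zip(weights, cluster_means))
        for d in range(dim)
    ]


def apply_affine_transform(A, b, x):
    """Speaker transform in CMLLR form: an affine map A x + b on a feature vector."""
    return [sum(a * xj for a, xj in zip(row, x)) + bi for row, bi in zip(A, b)]


# Hypothetical toy values, chosen only to keep the arithmetic easy to follow.
# One mean vector per cluster (assumed already picked from each cluster's tree).
cluster_means = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]

# One interpolation weight vector per language (the language factor).
language_weights = {
    "lang_a": [1.0, 0.5, 0.0],
    "lang_b": [1.0, 0.0, 0.5],
}

# One affine transform (A, b) per speaker (the speaker factor).
speaker_transforms = {
    "speaker_1": ([[2.0, 0.0], [0.0, 1.0]], [0.5, -0.5]),
}

# The language factor changes the model mean; no speaker is involved here.
mean_a = interpolate_cluster_means(cluster_means, language_weights["lang_a"])
mean_b = interpolate_cluster_means(cluster_means, language_weights["lang_b"])

# The speaker factor is stored separately, so the same (A, b) can be paired
# with either language's means.
A, b = speaker_transforms["speaker_1"]
transformed = apply_affine_transform(A, b, [0.25, 1.5])
```

Because the language weights and the speaker transform live in separate structures, swapping `"lang_a"` for `"lang_b"` leaves `speaker_transforms["speaker_1"]` untouched, which mirrors the cross-language use of a speaker transform described in the abstract.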
Series
This talk is part of the speech synthesis seminar series.
Included in Lists
- Cambridge Forum of Science and Humanities
- Cambridge Language Sciences
- Cambridge talks
- Cambridge University Engineering Department, Lecture Room 11
- Chris Davis' list
- CUED Speech Group Seminars
- Guy Emerson's list
- Information Engineering Division seminar list
- PhD related
- speech synthesis seminar series