BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Statistical Parametric Speech Synthesis Based on Speaker and Langu
 age Factorization - Heiga Zen (Toshiba Research Europe Ltd.)
DTSTART:20110621T120000Z
DTEND:20110621T133000Z
UID:TALK31796@talks.cam.ac.uk
CONTACT:Kai Yu
DESCRIPTION:An increasingly common scenario in building hidden Markov mode
 l-based speech synthesis and recognition systems is training on inhomogene
 ous data.  For example\, data from multiple different sources and/or diffe
 rent types of data are used.  This seminar introduces a new technique for 
 training hidden Markov models on such inhomogeneous speech data\, in this 
 case including speaker and language variations. The proposed technique\, s
 peaker and language factorization\, attempts to factorize speaker-specific
 /language-specific characteristics in the data and model them by individua
 l transforms.  Language-specific factors in the data are represented by tr
 ansforms based on cluster mean interpolation with cluster-dependent decisi
 on trees.  Acoustic variations caused by speaker characteristics are handl
 ed by transforms based on constrained maximum likelihood linear regression
 .  This technique allows multi-speaker/multi-language adaptive training to
  be performed.  Since each factor is represented by an individual transfor
 m\, it is possible to factor in only one of them.  Experimental results on
  statistical parametric speech synthesis show that the proposed technique 
 enables the speaker and language to be factorized\, allowing the speaker t
 ransform estimated in one language to be successfully used to synthesize s
 peech in a different language while keeping the voice characteristics.\n
LOCATION:Cambridge University Engineering Department\, Lecture Room 11
END:VEVENT
END:VCALENDAR
