DocumentCode :
3145379
Title :
Combining vocal tract length normalization with hierarchial linear transformations
Author :
Saheer, Lakshmi ; Yamagishi, Junichi ; Garner, Philip N. ; Dines, John
Author_Institution :
Idiap Res. Inst., Martigny, Switzerland
fYear :
2012
fDate :
25-30 March 2012
Firstpage :
4493
Lastpage :
4496
Abstract :
Recent research has demonstrated the effectiveness of vocal tract length normalization (VTLN) as a rapid adaptation technique for statistical parametric speech synthesis. VTLN produces speech whose naturalness is preferred to that of MLLR-based adaptation techniques and is much closer in quality to speech generated by the original average voice model. However, with only a single parameter, VTLN captures very few speaker-specific characteristics compared with linear-transform-based adaptation techniques. This paper proposes that the merits of VTLN can be combined with those of linear-transform-based adaptation in a hierarchical Bayesian framework, where VTLN is used as the prior information. A novel technique for propagating the gender information from the VTLN prior through constrained structural maximum a posteriori linear regression (CSMAPLR) adaptation is presented. Experiments show that the resulting transformation has improved speech quality with better naturalness, intelligibility and improved speaker similarity.
Keywords :
Bayes methods; speech intelligibility; CSMAPLR adaptation; MLLR based adaptation technique; constrained structural maximum a posteriori linear regression; hierarchial Bayesian framework; hierarchial linear transformation; intelligibility; rapid adaptation technique; speaker similarity; statistical parametric speech synthesis; vocal tract length normalization; Adaptation models; Estimation; Hidden Markov models; Speech; Speech synthesis; Transforms; Vectors; Statistical parametric speech synthesis; constrained structural maximum a posteriori linear regression; hidden Markov models; speaker adaptation; vocal tract length normalization;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on
Conference_Location :
Kyoto
ISSN :
1520-6149
Print_ISBN :
978-1-4673-0045-2
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2012.6287948
Filename :
6287948