DocumentCode :
3528017
Title :
Full covariance state duration modeling for HMM-based speech synthesis
Author :
Lu, Heng ; Wu, Yi-Jian ; Tokuda, Keiichi ; Dai, Li-Rong ; Wang, Ren-Hua
Author_Institution :
iFlytek Speech Lab., Univ. of Sci. & Technol. of China, Hefei
fYear :
2009
fDate :
19-24 April 2009
Firstpage :
4033
Lastpage :
4036
Abstract :
This paper proposes a state duration modeling method that uses a full covariance matrix for HMM-based speech synthesis. In this method, a full covariance matrix, instead of the conventional diagonal covariance matrix, is adopted in the multi-dimensional Gaussian distribution that models the state durations of each context-dependent phoneme. At the synthesis stage, the state durations are predicted using the clustered context-dependent distributions with full covariance matrices. Experimental results show that speech synthesized with full-covariance state duration models is more natural than speech synthesized with the conventional method when the speaking rate of the synthesized speech is changed.
Keywords :
Gaussian distribution; covariance matrices; hidden Markov models; pattern clustering; speech synthesis; HMM-based speech synthesis; clustered context-dependent distribution; context-dependent phoneme; full covariance matrix state duration modeling; hidden Markov model; multidimensional Gaussian distribution; Computer science; Context modeling; Covariance matrix; Flowcharts; Gaussian distribution; Hidden Markov models; High temperature superconductors; Predictive models; Speech synthesis; Training data; HMM; duration; full covariance; speech synthesis;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on
Conference_Location :
Taipei
ISSN :
1520-6149
Print_ISBN :
978-1-4244-2353-8
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2009.4960513
Filename :
4960513