DocumentCode :
2838415
Title :
A new eigenvoice approach to speaker adaptation
Author :
Huang, Chih-Hsien ; Chien, Jen-Tzung ; Wang, Hsin-Min
Author_Institution :
Dept. of Comput. Sci. & Inf. Eng., Cheng Kung Univ., Tainan, Taiwan
fYear :
2004
fDate :
15-18 Dec. 2004
Firstpage :
109
Lastpage :
112
Abstract :
In this paper, we present two approaches to improving eigenvoice-based speaker adaptation. First, we present maximum a posteriori eigen-decomposition (MAPED), in which the linear combination coefficients of the eigenvector decomposition are estimated according to the MAP criterion. MAPED incorporates prior knowledge of the decomposition, modeled here by a Gaussian distribution. It achieves better performance than maximum likelihood eigen-decomposition (MLED) when little adaptation data is available. Second, we exploit the adaptation of the covariance matrices of the hidden Markov model (HMM) in the eigenvoice framework. Our method uses principal component analysis (PCA) to project the speaker-specific HMM parameters onto a smaller orthogonal feature space. We then reliably calculate the HMM covariance matrices using the observations in the reduced feature space. The adapted HMM covariance matrices are estimated by transforming the covariance matrices in the reduced feature space back to the original feature space. Experimental results show that eigenvoice speaker adaptation using MAPED and incorporating covariance adaptation improves on the performance of the original eigenvoice adaptation in Mandarin speech recognition.
Keywords :
Gaussian distribution; adaptive estimation; covariance matrices; eigenvalues and eigenfunctions; feature extraction; hidden Markov models; maximum likelihood estimation; principal component analysis; speech recognition; HMM; MAP estimation; MAPED; Mandarin speech recognition; PCA; eigenvector decomposition; eigenvoice; hidden Markov model; linear combination coefficients; maximum a posteriori eigen-decomposition; orthogonal feature space; prior decomposition knowledge; speaker adaptation; speaker-specific HMM parameters; Computer science; Covariance matrix; Information science; Loudspeakers; Maximum likelihood linear regression
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Chinese Spoken Language Processing, 2004 International Symposium on
Print_ISBN :
0-7803-8678-7
Type :
conf
DOI :
10.1109/CHINSL.2004.1409598
Filename :
1409598