DocumentCode :
940007
Title :
Rapid discriminative acoustic model based on eigenspace mapping for fast speaker adaptation
Author :
Zhou, Bowen ; Hansen, John H L
Author_Institution :
Robust Speech Process. Group, Univ. of Colorado, Boulder, CO, USA
Volume :
13
Issue :
4
fYear :
2005
fDate :
7/1/2005 12:00:00 AM
Firstpage :
554
Lastpage :
564
Abstract :
It is widely believed that strong correlations exist across an utterance as a consequence of time-invariant characteristics of speaker and acoustic environments. It is verified in this paper that the first primary eigendirections of the utterance covariance matrix are speaker dependent. Based on this observation, a novel family of fast speaker adaptation algorithms entitled Eigenspace Mapping (EigMap) is proposed. The proposed algorithms are applied to continuous density Hidden Markov Model (HMM) based speech recognition. The EigMap algorithm rapidly constructs discriminative acoustic models in the test speaker's eigenspace by preserving discriminative information learned from baseline models in the directions of the test speaker's eigenspace. Moreover, the adapted models are compressed by discarding model parameters that are assumed to contain no discrimination information. The core idea of EigMap can be extended in many ways, and a family of algorithms based on EigMap is described in this paper. Unsupervised adaptation experiments show that EigMap is effective in improving baseline models using very limited amounts of adaptation data with superior performance to conventional adaptation techniques such as MLLR and block diagonal MLLR. A relative improvement of 18.4% over a baseline recognizer is achieved using EigMap with only about 4.5 s of adaptation data. Furthermore, it is also demonstrated that EigMap is additive to MLLR by encompassing important speaker dependent discriminative information. A significant relative improvement of 24.6% over baseline is observed using 4.5 s of adaptation data by combining MLLR and EigMap techniques.
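The abstract's starting point, the leading eigendirections of an utterance covariance matrix, can be illustrated with a minimal NumPy sketch. This is not the EigMap algorithm itself, which the record does not specify. The function name `leading_eigendirections`, the (frames x dimensions) feature layout, the 13-dimensional features, and the 450-frame synthetic utterance (roughly 4.5 s at an assumed 10 ms frame shift) are all illustrative assumptions.

```python
import numpy as np

def leading_eigendirections(features, k):
    """Return the k largest eigenvalues and their eigenvectors of the
    covariance matrix computed over the frames of one utterance.

    features: array of shape (num_frames, dim); this layout is an assumption.
    """
    features = np.asarray(features, dtype=float)
    # rowvar=False treats each column as one feature dimension.
    cov = np.cov(features, rowvar=False)
    # eigh handles symmetric matrices and returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Reorder so the directions with the largest variance come first.
    order = np.argsort(eigvals)[::-1][:k]
    return eigvals[order], eigvecs[:, order]

# Synthetic stand-in for real acoustic features (hypothetical data).
rng = np.random.default_rng(0)
frames = rng.normal(size=(450, 13)) * np.linspace(3.0, 0.5, 13)
vals, vecs = leading_eigendirections(frames, k=3)
```

Each column of `vecs` is a unit-length direction in feature space, and `vals` holds the variance of the utterance along that direction, largest first.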
Keywords :
covariance matrices; eigenvalues and eigenfunctions; hidden Markov models; speech recognition; eigenspace mapping; fast speaker adaptation; hidden Markov model; rapid discriminative acoustic model; speaker dependent discriminative information; speech recognition; utterance covariance matrix; Acoustic testing; Additives; Covariance matrix; Hidden Markov models; Linear regression; Loudspeakers; Maximum likelihood linear regression; Robustness; Speech processing; Speech recognition; Discriminative acoustic model; eigenspace mapping; hidden Markov models; rapid speaker adaptation; speech recognition
fLanguage :
English
Journal_Title :
Speech and Audio Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1063-6676
Type :
jour
DOI :
10.1109/TSA.2005.845808
Filename :
1453598