Title :
Eigen-MLLRs applied to unsupervised speaker enrollment for large vocabulary continuous speech recognition
Author :
Aubert, Xavier L.
Author_Institution :
Philips Res. Lab., Aachen, Germany
Abstract :
The concept of eigen-MLLRs (Kuan-ting Chen et al., Proc. ICSLP 2000, pp. 742-745, 2000; Wang, N. J.-C. et al., Proc. ICASSP 2001, pp. 345-349, 2001), a variant of the eigenvoice method, is applied to unsupervised speaker enrollment in a large-vocabulary continuous speech recognition (CSR) system. The emphasis is on fast adaptation. Two ways of estimating multiple eigen-MLLR transformations are introduced: jointly or separately with respect to the eigen-MLLR vector space. Joint estimation allows multiple transforms to be estimated robustly from sparse data, while separate estimation achieves more accurate adaptation when more samples become available. The first decoded words spoken by a new test speaker are used to adapt the speaker-independent HMM means. The impact of this new enrollment algorithm is evaluated on a large real-life database of professional medical transcriptions. Significant reductions in word error rate are achieved with less than 10 seconds of enrollment speech and without any supervision.
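The general eigen-MLLR idea named in the abstract can be illustrated with a minimal sketch; it is not the paper's algorithm. It assumes a single global transform (not the joint or separate multi-transform estimators introduced here), identity covariances, and hard frame-to-Gaussian alignments, so that the maximum-likelihood weight estimate reduces to ordinary least squares. All function and variable names (`eigen_mllr_basis`, `estimate_weights`, `adapt_means`, `align`) are hypothetical.

```python
import numpy as np

def eigen_mllr_basis(transforms, n_components):
    """PCA over flattened per-speaker MLLR matrices.

    transforms: (S, d, d+1) array, one mean transform per training speaker.
    Returns the mean transform (d, d+1) and n_components eigen-transforms.
    """
    n_speakers = transforms.shape[0]
    shape = transforms.shape[1:]
    flat = transforms.reshape(n_speakers, -1)
    mean = flat.mean(axis=0)
    # Rows of vt are the principal directions of the centered transforms.
    _, _, vt = np.linalg.svd(flat - mean, full_matrices=False)
    basis = vt[:n_components].reshape((n_components,) + shape)
    return mean.reshape(shape), basis

def _extend(si_means):
    """Prepend a bias term: xi = [1, mu] for every Gaussian mean."""
    return np.hstack([np.ones((si_means.shape[0], 1)), si_means])

def estimate_weights(mean_w, basis, si_means, frames, align):
    """Least-squares eigen-MLLR weights for a new speaker.

    Assumption: identity covariances and hard alignments, where align[t]
    is the index of the Gaussian that frame t was assigned to by decoding.
    """
    xi = _extend(si_means)[align]                  # (T, d+1)
    residual = frames - xi @ mean_w.T              # (T, d)
    # Column k holds the effect of eigen-transform k on every frame.
    design = np.stack([(xi @ e.T).ravel() for e in basis], axis=1)
    weights, *_ = np.linalg.lstsq(design, residual.ravel(), rcond=None)
    return weights

def adapt_means(mean_w, basis, weights, si_means):
    """Apply W = mean_w + sum_k weights[k] * basis[k] to all SI means."""
    w = mean_w + np.tensordot(weights, basis, axes=1)
    return _extend(si_means) @ w.T
```

Because only a handful of weights are estimated instead of a full d x (d+1) matrix, a few seconds of decoded speech can be enough to constrain them, which is the motivation for fast enrollment described above.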
Keywords :
eigenvalues and eigenfunctions; error statistics; hidden Markov models; natural language interfaces; parameter estimation; speech recognition; unsupervised learning; continuous speech recognition; eigen-MLLR; eigenvoice; large vocabulary speech recognition; multiple transform estimation; professional medical transcriptions; sparse data; speaker-independent HMM means; unsupervised speaker enrollment; word-error-rates; Automatic speech recognition; Decoding; Hidden Markov models; Loudspeakers; Maximum likelihood linear regression; Principal component analysis; Robustness; Speech recognition; Training data; Vocabulary;
Conference_Title :
Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
Print_ISBN :
0-7803-8484-9
DOI :
10.1109/ICASSP.2004.1325994