Title :
Developments in continuous speech dictation using the 1995 ARPA NAB news task
Author :
Gauvain, J.-L. ; Lamel, L. ; Adda, G. ; Matrouf, D.
Author_Institution :
Lab. d'Inf. pour la Mecanique et les Sci. de l'Ingenieur, CNRS, Orsay, France
Abstract :
We report on the LIMSI recognizer evaluated in the ARPA 1995 North American Business (NAB) news benchmark test. In contrast to previous evaluations, the new Hub 3 test aims at improving basic speaker-independent (SI) continuous speech recognition (CSR) performance on unlimited-vocabulary read speech recorded under more varied acoustical conditions (background environmental noise and unknown microphones). The LIMSI recognizer is an HMM-based system with Gaussian mixture densities. Decoding is carried out in multiple forward acoustic passes, where more refined acoustic and language models are used in successive passes and information is passed between them via word graphs. To deal with the varied acoustic conditions, channel compensation is performed iteratively, refining the noise estimates before each of the first three decoding passes. The final decoding pass is carried out with speaker-adapted models obtained via unsupervised adaptation using the MLLR method. On the Sennheiser microphone data (average SNR 29 dB) a word error rate of 9.1% was obtained, compared with 17.5% on the secondary microphone data (average SNR 15 dB) using the same recognition system.
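The word error figures quoted above are conventionally computed as the number of word substitutions, deletions, and insertions divided by the number of reference words. The abstract does not describe the scoring procedure used in the evaluation, so the following is only a minimal sketch of that standard definition. The function name `word_error_rate` and the example sentences are hypothetical, and whitespace tokenization is an assumption.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Assumption: words are separated by whitespace and compared exactly
    (no text normalization), which a real scoring tool may handle differently.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[len(hyp)] / len(ref)


if __name__ == "__main__":
    # Hypothetical sentences: one substitution and one deletion over five words.
    ref = "the market closed higher today"
    hyp = "the market close higher"
    print(f"WER = {word_error_rate(ref, hyp):.1%}")
```

Because insertions count as errors, this ratio can exceed 100% when the hypothesis is much longer than the reference.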
Keywords :
Gaussian processes; acoustic signal processing; decoding; dictation; hidden Markov models; natural languages; speech processing; speech recognition; 1995 ARPA NAB news task; Gaussian mixture; HMM based system; Hub 3 test; LIMSI recognizer; MLLR method; North American Business news test; acoustic conditions; acoustic models; acoustical conditions; background environmental noise; channel compensation; continuous speech dictation; decoding; language models; microphones; multiple forward acoustic passes; noise estimates; speaker adapted models; unlimited vocabulary read speech; unsupervised adaptation; word graphs; Acoustic noise; Acoustic testing; Benchmark testing; Hidden Markov models; Iterative decoding; Maximum likelihood linear regression; Microphones; Speech analysis; Speech enhancement; Working environment noise;
Conference_Title :
Acoustics, Speech, and Signal Processing, 1996. ICASSP-96. Conference Proceedings., 1996 IEEE International Conference on
Conference_Location :
Atlanta, GA
Print_ISBN :
0-7803-3192-3
DOI :
10.1109/ICASSP.1996.540293