DocumentCode :
1559382
Title :
A robust compensation strategy for extraneous acoustic variations in spontaneous speech recognition
Author :
Jiang, Hui ; Deng, Li
Author_Institution :
Dept. of Electr. & Comput. Eng., Waterloo Univ., Ont., Canada
Volume :
10
Issue :
1
fYear :
2002
fDate :
1/1/2002
Firstpage :
9
Lastpage :
17
Abstract :
We propose a robust compensation strategy to deal effectively with extraneous acoustic variations in spontaneous speech recognition. The strategy extends speaker adaptive training and uses hidden Markov model (HMM) parameter transformations to normalize the extraneous variations in the training data according to a set of predefined conditions. A "compact" model and the associated prior probability density functions (PDFs) of the transformation parameters are estimated using the maximum likelihood criterion. In the testing phase, the generic model and the prior PDFs are used to search for the unknown word sequence based on Bayesian prediction classification (BPC). The proposed strategy is evaluated on the Switchboard task, where it is used to deal with three types of extraneous variation and mismatch in conversational speech recognition: pronunciation variations, inter-speaker variability, and telephone handset mismatch. Experimental results show that a moderate word error rate reduction is achieved in comparison with a well-trained baseline HMM system under identical experimental conditions.
Keywords :
Bayes methods; hidden Markov models; maximum likelihood estimation; probability; speech recognition; Bayesian prediction classification; HMM parameter transformations; HMM system; compact model; conversational speech recognition; extraneous acoustic variations; generic model; hidden Markov models; inter-speaker variability; maximum likelihood criterion; prior PDF; prior probability density functions; pronunciation variations; robust compensation; robust decoding; speaker adaptive training; spontaneous speech recognition; switchboard task; telephone handset mismatch; testing phase; word error rate reduction; word sequence; Hidden Markov models; Loudspeakers; Maximum likelihood estimation; Parameter estimation; Predictive models; Probability density function; Robustness; Speech recognition; Testing; Training data;
fLanguage :
English
Journal_Title :
Speech and Audio Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1063-6676
Type :
jour
DOI :
10.1109/89.979381
Filename :
979381