DocumentCode :
789329
Title :
Robust feature-estimation and objective quality assessment for noisy speech recognition using the Credit Card corpus
Author :
Hansen, John H L ; Arslan, Levent M.
Author_Institution :
Dept. of Electr. Eng., Duke Univ., Durham, NC, USA
Volume :
3
Issue :
3
fYear :
1995
fDate :
5/1/1995
Firstpage :
169
Lastpage :
184
Abstract :
The introduction of acoustic background distortion into speech causes recognition algorithms to fail. In order to improve the environmental robustness of speech recognition in adverse conditions, a novel constrained-iterative feature-estimation algorithm is considered and shown to produce improved feature characterization in a variety of actual noise conditions. In addition, an objective measure based MAP estimator is formulated as a means of predicting changes in robust recognition performance at the speech feature extraction stage. The four measures considered are (i) NIST SNR; (ii) Itakura-Saito log-likelihood; (iii) log-area-ratio; and (iv) the weighted-spectral slope measure. A continuous distribution, monophone based, hidden Markov model recognition algorithm is used for objective measure based MAP estimator analysis and recognition evaluation. Evaluations were based on speech data from the Credit Card corpus (CC-DATA). It is shown that feature enhancement provides a consistent level of recognition improvement for broadband and low-frequency colored noise sources. As the stationarity assumption for a given noise source breaks down, the ability of feature enhancement to improve recognition performance decreases. Finally, the log-likelihood based MAP estimator was found to be the best predictor of recognition performance, while the NIST SNR based MAP estimator was found to be the poorest predictor of recognition performance across the 27 noise conditions considered.
Keywords :
feature extraction; hidden Markov models; iterative methods; maximum likelihood estimation; speech recognition; Credit Card corpus; Itakura-Saito log-likelihood; MAP estimator analysis; NIST SNR; acoustic background distortion; adverse conditions; broadband noise sources; constrained-iterative feature-estimation algorithm; continuous distribution monophone based hidden Markov model recognition algorithm; feature enhancement; log-area-ratio; low-frequency colored noise sources; noisy speech recognition; objective measure based MAP estimator; objective quality assessment; robust feature-estimation; speech feature extraction; stationarity assumption; weighted-spectral slope measure; Acoustic distortion; Acoustic measurements; Acoustic noise; Distortion measurement; NIST; Noise robustness; Quality assessment; Signal to noise ratio; Speech recognition; Working environment noise;
fLanguage :
English
Journal_Title :
Speech and Audio Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1063-6676
Type :
jour
DOI :
10.1109/89.388143
Filename :
388143