Regional Information Center for Science and Technology - Robust feature-estimation and objective quality assessment for noisy speech recognition using the Credit Card corpus

DocumentCode :

789329

Title :

Robust feature-estimation and objective quality assessment for noisy speech recognition using the Credit Card corpus

Author :

Hansen, John H L ; Arslan, Levent M.

Author_Institution :

Dept. of Electr. Eng., Duke Univ., Durham, NC, USA

Volume :

Issue :

fYear :

1995

fDate :

5/1/1995

Firstpage :

169

Lastpage :

184

Abstract :

The introduction of acoustic background distortion into speech causes recognition algorithms to fail. To improve the environmental robustness of speech recognition in adverse conditions, a novel constrained-iterative feature-estimation algorithm is considered and shown to produce improved feature characterization in a variety of actual noise conditions. In addition, an objective-measure-based MAP estimator is formulated as a means of predicting changes in robust recognition performance at the speech feature extraction stage. The four measures considered are (i) NIST SNR; (ii) Itakura-Saito log-likelihood; (iii) log-area-ratio; and (iv) the weighted-spectral slope measure. A continuous-distribution, monophone-based hidden Markov model recognition algorithm is used for objective-measure-based MAP estimator analysis and recognition evaluation. Evaluations were based on speech data from the Credit Card corpus (CC-DATA). It is shown that feature enhancement provides a consistent level of recognition improvement for broadband and low-frequency colored noise sources. As the stationarity assumption for a given noise source breaks down, the ability of feature enhancement to improve recognition performance decreases. Finally, the log-likelihood-based MAP estimator was found to be the best predictor of recognition performance, while the NIST SNR-based MAP estimator was found to be the poorest predictor across the 27 noise conditions considered.

Keywords :

feature extraction; hidden Markov models; iterative methods; maximum likelihood estimation; speech recognition; Credit Card corpus; Itakura-Saito log-likelihood; MAP estimator analysis; NIST SNR; acoustic background distortion; adverse conditions; broadband noise sources; constrained-iterative feature-estimation algorithm; continuous distribution monophone based hidden Markov model recognition algorithm; feature enhancement; log-area-ratio; low-frequency colored noise sources; noisy speech recognition; objective measure based MAP estimator; objective quality assessment; robust feature-estimation; speech feature extraction; stationarity assumption; weighted-spectral slope measure; Acoustic distortion; Acoustic measurements; Acoustic noise; Distortion measurement; NIST; Noise robustness; Quality assessment; Signal to noise ratio; Speech recognition; Working environment noise;

fLanguage :

English

Journal_Title :

Speech and Audio Processing, IEEE Transactions on

Publisher :

IEEE

ISSN :

1063-6676

Type :

jour

DOI :

10.1109/89.388143

Filename :

388143

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=789329