DocumentCode :
2701806
Title :
Stereo-Based Stochastic Mapping for Robust Speech Recognition
Author :
Afify, M. ; Xiaodong Cui ; Yuqing Gao
Author_Institution :
IBM Thomas J. Watson Research Center, Yorktown Heights, NY, USA
Volume :
4
fYear :
2007
fDate :
15-20 April 2007
Abstract :
We present a stochastic mapping technique for robust speech recognition that uses stereo data. The idea is to build a Gaussian mixture model (GMM) for the joint distribution of the clean and noisy channels during training and to apply an iterative compensation algorithm during testing. The proposed mapping can also be interpreted as a mixture of linear transforms that are estimated in a special way using stereo data. The proposed method yields a 28% relative improvement in string error rate (SER) for digit recognition in the car and, when applied in conjunction with multi-style training (MST), about a 10% relative improvement in word error rate (WER) for large vocabulary English speech recognition.
Keywords :
Gaussian processes; iterative methods; speech recognition; GMM; digit recognition; iterative compensation algorithm; linear transforms; multi-style training; robust speech recognition; stereo-based stochastic mapping; string error rate; vocabulary English speech recognition; word error rate; Error analysis; Iterative algorithms; Noise generators; Noise robustness; Speech recognition; Stochastic processes; Testing; Training data; Vocabulary; Working environment noise; Noise robustness; non-linear mapping; speech recognition; stereo-data;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech and Signal Processing, 2007. ICASSP 2007. IEEE International Conference on
Conference_Location :
Honolulu, HI
ISSN :
1520-6149
Print_ISBN :
1-4244-0727-3
Type :
conf
DOI :
10.1109/ICASSP.2007.366928
Filename :
4218116