Title :
An alternative front-end for the AT&T WATSON LV-CSR system
Author :
Dimitriadis, Dimitrios ; Bocchieri, Enrico ; Caseiro, Diamantino
Author_Institution :
AT&T Res., Florham Park, NJ, USA
Abstract :
In previously published work, we proposed a novel feature extraction algorithm, based on Teager-Kaiser energy estimates, that approximates human auditory characteristics and is more robust to subband noise than the mean-square estimates of standard MFCCs. We refer to the novel features as Teager energy cepstrum coefficients (TECCs). Herein, we study TECC performance under additive noise and suggest how to predict the deviations of the noisy TECCs by estimating the subband SNR values. We then report on the effectiveness of the TECCs when they are used in the acoustic front-end of the state-of-the-art AT&T WATSON large-vocabulary recognizer. The TECC front-end is tested in the real-life voice-search Speak4it application for mobile devices. It provides a 6% relative word error rate reduction compared with the MFCC front-end, using the same high-performance language model, lexicon and acoustic model training.
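The abstract contrasts Teager-Kaiser energy estimates with the mean-square estimates used in standard MFCCs. As a minimal sketch of that distinction only, the snippet below applies the standard discrete Teager-Kaiser energy operator, psi[n] = x[n]^2 - x[n-1]*x[n+1], to a synthetic tone and compares it with the mean-square energy. It is not the TECC front-end described in the paper: the filterbank, framing and cepstral steps are not given here and are left out. The function names, the test tone and its parameters are assumptions made for illustration.

```python
import numpy as np


def teager_kaiser_energy(x):
    """Standard discrete Teager-Kaiser operator: psi[n] = x[n]^2 - x[n-1]*x[n+1].

    Returns one value per interior sample, so the output is two samples shorter.
    """
    x = np.asarray(x, dtype=float)
    return x[1:-1] ** 2 - x[:-2] * x[2:]


def mean_square_energy(x):
    """Mean-square energy of the whole signal, as used by conventional estimates."""
    x = np.asarray(x, dtype=float)
    return np.mean(x ** 2)


# Hypothetical test signal: a pure tone (not data from the paper).
amplitude = 0.5
omega = 0.2 * np.pi          # digital frequency in radians per sample
n = np.arange(400)           # 40 full periods of the tone
tone = amplitude * np.cos(omega * n)

tk = teager_kaiser_energy(tone)
ms = mean_square_energy(tone)

# For a pure tone the operator is constant and equals A^2 * sin^2(omega),
# so it tracks frequency as well as amplitude; the mean square is A^2 / 2.
print("Teager-Kaiser energy (first samples):", tk[:3])
print("Expected A^2 * sin^2(omega):         ", amplitude ** 2 * np.sin(omega) ** 2)
print("Mean-square energy:                  ", ms)
```

For a single sinusoid the operator output depends on both amplitude and frequency, whereas the mean-square value depends on amplitude alone. This is the general property of the operator and makes no claim about the noise behaviour reported in the paper.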
Keywords :
mean square error methods; speech recognition; AT&T Watson LV-CSR system; MFCC; SNR value; TECC deviations; Teager-Kaiser; acoustic model training; alternative front-end; feature extraction algorithm; high performance language model; large-vocabulary recognizer; mean-square estimation; mobile devices; Teager energy cepstrum coefficients; Hidden Markov models; Mel frequency cepstral coefficient; Noise; Noise measurement; Robustness; Speech; cepstrum analysis; error analysis; parameter estimation; robustness; speech processing; speech recognition;
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on
Conference_Location :
Prague
Print_ISBN :
978-1-4577-0538-0
Electronic_ISBN :
1520-6149
DOI :
10.1109/ICASSP.2011.5947351