DocumentCode :
1686299
Title :
Investigating deep neural network based transforms of robust audio features for LVCSR
Author :
Bocchieri, Enrico ; Dimitriadis, Dimitrios
Author_Institution :
AT&T Res., Florham Park, NJ, USA
fYear :
2013
Firstpage :
6709
Lastpage :
6713
Abstract :
Micro-modulation components such as the formant frequencies are very important characteristics of speech that have allowed great performance improvements in small-vocabulary ASR tasks. Yet they have seen limited use in large-vocabulary ASR applications. To enable the successful application of these frequency measures in real-life tasks, we investigate their combination with traditional features (MFCCs and PLPs) through linear (e.g. HDA) and non-linear (bottleneck MLP) feature transforms. Our experiments show that integrating micro-modulation and cepstral features with non-linear MLP-based transforms greatly improves ASR performance relative to the cepstral features alone. We have applied this novel feature extraction scheme to two very different tasks, a clean speech task (DARPA-WSJ) and a real-life, open-vocabulary, mobile search task (Speak4itSM), and report improved performance on both. The relative error rate reduction is 15% for the Speak4itSM task, with similar improvements of up to 21% for the WSJ task.
Keywords :
feature extraction; neural nets; speech recognition; transforms; DARPA-WSJ; LVCSR; Speak4itSM; cepstral features; deep neural network based transforms; feature extraction scheme; feature transforms; formant frequencies; frequency measures; large vocabulary ASR applications; micromodulation components; mobile search task; nonlinear MLP-based transforms; open-vocabulary; robust audio features; small-vocabulary ASR tasks; spoken speech; Accuracy; Feature extraction; Hidden Markov models; Speech; Speech recognition; Training; Transforms; Neural networks; feature extraction; robustness; speech recognition;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on
Conference_Location :
Vancouver, BC
ISSN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2013.6638960
Filename :
6638960