Title :
Early fusion of Sparse Classification and GMM for noise robust ASR
Author :
Sun, Yang ; Gemmeke, Jort F. ; Cranen, Bert ; ten Bosch, Louis ; Boves, Lou
Author_Institution :
Centre for Language & Speech Technology, Radboud University Nijmegen, Nijmegen, Netherlands
Date :
Aug. 29 - Sept. 2, 2011
Abstract :
In previous work we have shown that an ASR system consisting of a dual-input DBN, which simultaneously observes MFCC acoustic features and predicted phone labels generated by an exemplar-based Sparse Classification (SC) system, can achieve better word recognition accuracy in noise than a system observing only one of those input streams. This paper explores two modifications of the SC input to further improve the noise robustness of the dual-input DBN system: 1) integrating more time context and 2) using the N best states. Experiments on AURORA-2 reveal that the first modification significantly improves the recognition results at almost all SNRs, particularly in the noisier conditions, achieving an absolute accuracy gain of up to 6.1% at an SNR of -5 dB. Experiments with the second modification show that there is an optimal N, which improves the maximum attainable accuracy by a further 11.8% at -5 dB.
Keywords :
Gaussian processes; mixture models; sensor fusion; signal classification; speech recognition; AURORA-2; GMM; Gaussian mixture models; MFCC acoustic features; SC system; automatic speech recognition; dual-input DBN system; exemplar-based sparse classification system; noise robust ASR system; phone labels; sparse classification early fusion; word recognition accuracy; Accuracy; Hidden Markov models; Noise; Noise measurement; Speech; Speech recognition; Vectors;
Conference_Title :
Signal Processing Conference, 2011 19th European
Conference_Location :
Barcelona