DocumentCode :
3164978
Title :
Non-negative matrix factorization for highly noise-robust ASR: To enhance or to recognize?
Author :
Weninger, Felix ; Wöllmer, Martin ; Geiger, Jürgen ; Schuller, Björn ; Gemmeke, Jort F. ; Hurmalainen, Antti ; Virtanen, Tuomas ; Rigoll, Gerhard
Author_Institution :
Inst. for Human-Machine Commun., Tech. Univ. München, München, Germany
fYear :
2012
fDate :
25-30 March 2012
Firstpage :
4681
Lastpage :
4684
Abstract :
This paper proposes a multi-stream speech recognition system that combines information from three complementary analysis methods in order to improve automatic speech recognition in highly noisy and reverberant environments, as featured in the 2011 PASCAL CHiME Challenge. We integrate word predictions by a bidirectional Long Short-Term Memory recurrent neural network and non-negative sparse classification (NSC) into a multi-stream Hidden Markov Model using convolutive non-negative matrix factorization (NMF) for speech enhancement. Our results suggest that NMF-based enhancement and NSC are complementary despite their overlap in methodology, reaching up to 91.9% average keyword accuracy on the Challenge test set at signal-to-noise ratios from -6 to 9 dB, the best result reported so far on these data.
Keywords :
hidden Markov models; matrix decomposition; recurrent neural nets; speech enhancement; speech recognition; automatic speech recognition; average keyword accuracy; bidirectional long short term memory recurrent neural network; challenge test set; complementary analysis method; convolutive nonnegative matrix factorization; multistream hidden Markov model; multistream speech recognition; noise robust ASR; nonnegative sparse classification; signal-to-noise ratio; speech enhancement; word predictions; Hidden Markov models; Mel frequency cepstral coefficient; Noise; Speech; Speech enhancement; Speech recognition; Training; Non-Negative Matrix Factorization; Tandem Speech Recognition;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on
Conference_Location :
Kyoto
ISSN :
1520-6149
Print_ISBN :
978-1-4673-0045-2
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2012.6288963
Filename :
6288963