DocumentCode :
2799045
Title :
Non-negative matrix factorization as noise-robust feature extractor for speech recognition
Author :
Schuller, Björn ; Weninger, Felix ; Wöllmer, Martin ; Sun, Yang ; Rigoll, Gerhard
Author_Institution :
Inst. for Human-Machine Commun., Tech. Univ. München, München, Germany
fYear :
2010
fDate :
14-19 March 2010
Firstpage :
4562
Lastpage :
4565
Abstract :
We introduce a novel approach for noise-robust feature extraction in speech recognition, based on non-negative matrix factorization (NMF). While NMF has previously been used for speech denoising and speaker separation, we directly extract time-varying features from the NMF output. To this end, we extend basic unsupervised NMF to a hybrid supervised/unsupervised algorithm. We present a Dynamic Bayesian Network (DBN) architecture that can exploit these features in a Tandem manner together with the maximum likelihood phoneme estimate of a bidirectional long short-term memory (BLSTM) recurrent neural network. We show that the addition of NMF features to spelling recognition systems can increase word accuracy by up to 7% absolute in a noisy car environment.
Keywords :
belief networks; feature extraction; matrix decomposition; maximum likelihood estimation; recurrent neural nets; signal denoising; speech recognition; word processing; bidirectional long short term memory recurrent neural network; dynamic Bayesian network architecture; hybrid supervised-unsupervised algorithm; maximum likelihood estimation; noise robust feature extractor; noisy car environment; nonnegative matrix factorization; speaker separation; speech denoising; speech recognition; spelling recognition system; time varying feature extraction; unsupervised NMF; Acoustic noise; Automatic speech recognition; Bayesian methods; Feature extraction; Man machine systems; Noise reduction; Noise robustness; Recurrent neural networks; Signal processing; Speech recognition; Dynamic Bayesian Networks; Long Short-Term Memory; Noise robustness; Non-Negative Matrix Factorization; Speech recognition;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
Conference_Location :
Dallas, TX
ISSN :
1520-6149
Print_ISBN :
978-1-4244-4295-9
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2010.5495567
Filename :
5495567