DocumentCode :
142196
Title :
Noise suppression based on nonnegative matrix factorization for robust speech recognition
Author :
Hao-teng Fan ; Pao-han Lin ; Jeih-weih Hung
Author_Institution :
Dept. of Electr. Eng., Nat. Chi Nan Univ., Nantou, Taiwan
Volume :
3
fYear :
2014
fDate :
26-28 April 2014
Firstpage :
1732
Lastpage :
1736
Abstract :
This paper presents a novel noise robustness method, nonnegative matrix factorization-based noise suppression (NNS), which enhances the magnitude spectrum of speech signals to improve speech recognition performance in noise-corrupted environments. In the presented approach, the clean data and noise in the training set are first converted to spectrograms via the short-time Fourier transform (STFT), and the basis spectral matrices of the speech data and noise are learned from the corresponding spectrograms. Then, the magnitude spectrogram of the noise-corrupted testing data is factorized via the basis matrices of the clean data, and the resulting noise components are removed from the original magnitude spectrogram. Finally, the new noise-reduced magnitude spectrogram is combined with the original noisy phase spectrogram and converted back to a time-domain signal, which is subsequently converted to a sequence of MFCC speech features. When the presented NNS is used as a pre-processing stage of the speech recognition system, the obtained recognition accuracy can outperform the MFCC baseline, especially at medium and low SNRs. Furthermore, performing NNS on separate sub-band spectrograms can further improve the recognition results relative to the original NNS performed on the full-band spectrogram, indicating that sub-band NNS can produce more robust speech features suitable for noisy speech recognition.
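The processing chain described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names (`nmf`, `suppress_noise`, `recombine`), the Euclidean multiplicative update rule, the basis ranks, the concatenation of speech and noise bases, and the subtraction with flooring are all assumptions, since the abstract does not specify them. Synthetic nonnegative matrices stand in for STFT magnitude spectrograms, and MFCC extraction is omitted.

```python
import numpy as np

EPS = 1e-10  # guards against division by zero in the updates


def nmf(V, rank, n_iter=200, W=None, seed=0):
    """Approximate nonnegative V (freq x frames) as W @ H using
    multiplicative updates for the Euclidean cost (an assumed choice).
    If W is supplied it is held fixed and only H is updated."""
    rng = np.random.default_rng(seed)
    update_W = W is None
    if update_W:
        W = rng.random((V.shape[0], rank)) + EPS
    H = rng.random((W.shape[1], V.shape[1])) + EPS
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + EPS)
        if update_W:
            W *= (V @ H.T) / (W @ H @ H.T + EPS)
    return W, H


def suppress_noise(V_noisy, W_speech, W_noise, n_iter=200):
    """Factorize the noisy magnitude spectrogram on fixed, concatenated
    bases, then remove the estimated noise part (hypothetical variant:
    plain subtraction floored at zero)."""
    W = np.hstack([W_speech, W_noise])
    _, H = nmf(V_noisy, W.shape[1], n_iter=n_iter, W=W)
    noise_est = W_noise @ H[W_speech.shape[1]:]
    return np.maximum(V_noisy - noise_est, 0.0)


def recombine(mag, noisy_stft):
    """Attach the original noisy phase to an enhanced magnitude."""
    return mag * np.exp(1j * np.angle(noisy_stft))


# Synthetic stand-ins for training and test magnitude spectrograms.
rng = np.random.default_rng(1)
speech_train = rng.random((64, 5)) @ rng.random((5, 200))
noise_train = rng.random((64, 3)) @ rng.random((3, 200))
W_s, _ = nmf(speech_train, rank=5)   # speech basis from clean data
W_n, _ = nmf(noise_train, rank=3)    # noise basis from noise data

V_noisy = speech_train[:, :50] + noise_train[:, :50]
V_enh = suppress_noise(V_noisy, W_s, W_n)
```

In a full pipeline, `V_noisy` would be the magnitude of an STFT of the test utterance, and `recombine` followed by an inverse STFT would give the time-domain signal from which MFCC features are computed. The sub-band variant mentioned in the abstract would presumably apply the same steps to separate frequency ranges of the spectrogram.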
Keywords :
fast Fourier transforms; matrix decomposition; speech recognition; MFCC baseline; MFCC speech features; basis spectral matrices; full-band spectrogram; magnitude spectrogram; magnitude spectrum; noise components; noise robustness method; noise suppression; noise-corrupted environments; noise-corrupted testing data; noise-reduced magnitude spectrogram; noisy phase spectrogram; noisy speech recognition; nonnegative matrix factorization; robust speech features; robust speech recognition; short-time Fourier transform; spectrograms; speech data; speech recognition performance; speech recognition system; speech signals; subband spectrograms; time-domain signal; Signal to noise ratio; Spectrogram; Speech; Speech enhancement; Speech recognition; noise-robustness
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Information Science, Electronics and Electrical Engineering (ISEEE), 2014 International Conference on
Conference_Location :
Sapporo
Print_ISBN :
978-1-4799-3196-5
Type :
conf
DOI :
10.1109/InfoSEEE.2014.6946219
Filename :
6946219