• DocumentCode
    142196
  • Title

    Noise suppression based on nonnegative matrix factorization for robust speech recognition

  • Author

    Hao-teng Fan ; Pao-han Lin ; Jeih-weih Hung

  • Author_Institution
    Dept. of Electr. Eng., Nat. Chi Nan Univ., Nantou, Taiwan
  • Volume
    3
  • fYear
    2014
  • fDate
    26-28 April 2014
  • Firstpage
    1732
  • Lastpage
    1736
  • Abstract
    This paper presents a novel noise robustness method, nonnegative matrix factorization-based noise suppression (NNS), to enhance the magnitude spectrum of speech signals for better speech recognition performance in noise-corrupted environments. In the presented approach, the clean data and noise in the training set are first converted to spectrograms via the short-time Fourier transform (STFT), and the basis spectral matrices of the speech data and noise are learned from the corresponding spectrograms. Then, the magnitude spectrogram of the noise-corrupted testing data is factorized via the basis matrices of the clean data, and the resulting noise components are suppressed in the original magnitude spectrogram. Finally, the new noise-reduced magnitude spectrogram is combined with the original noisy phase spectrogram and converted back to a time-domain signal, which is subsequently converted to a sequence of MFCC speech features. When the presented NNS is used as a pre-processing stage of the speech recognition system, the recognition accuracy outperforms the MFCC baseline, especially at medium and low SNRs. Furthermore, performing NNS on separate sub-band spectrograms further improves the recognition results relative to the original NNS applied to the full-band spectrogram, indicating that sub-band NNS can produce more robust speech features suitable for noisy speech recognition.
  • Keywords
    fast Fourier transforms; matrix decomposition; speech recognition; MFCC baseline; MFCC speech features; basis spectral matrices; full-band spectrogram; magnitude spectrogram; magnitude spectrum; noise components; noise robustness method; noise suppression; noise-corrupted environments; noise-corrupted testing data; noise-reduced magnitude spectrogram; noisy phase spectrogram; noisy speech recognition; nonnegative matrix factorization; robust speech features; robust speech recognition; short-time Fourier transform; spectrograms; speech data; speech recognition performance; speech recognition system; speech signals; subband spectrograms; time-domain signal; Signal to noise ratio; Spectrogram; Speech; Speech enhancement; Speech recognition; noise-robustness
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information Science, Electronics and Electrical Engineering (ISEEE), 2014 International Conference on
  • Conference_Location
    Sapporo
  • Print_ISBN
    978-1-4799-3196-5
  • Type

    conf

  • DOI
    10.1109/InfoSEEE.2014.6946219
  • Filename
    6946219
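
The processing chain described in the abstract (learn speech and noise bases, factorize the noisy magnitude spectrogram, reduce the noise part) might be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: it assumes Lee-Seung multiplicative updates with a Euclidean cost, assumes the noisy spectrogram is encoded against the concatenated speech and noise bases, and assumes the noise estimate is subtracted and floored at zero. The function names (`nmf`, `encode`, `suppress_noise`), the ranks, and the random stand-in spectrograms are all hypothetical. The STFT, resynthesis with the noisy phase, and MFCC extraction are omitted.

```python
import numpy as np

EPS = 1e-10  # guards against division by zero in the multiplicative updates


def nmf(V, rank, n_iter=200, seed=0):
    """Factorize nonnegative V (freq x frames) as W @ H.

    Uses Lee-Seung multiplicative updates for the Euclidean cost
    (an assumption; the abstract does not name the cost function).
    """
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + EPS
    H = rng.random((rank, V.shape[1])) + EPS
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + EPS)
        W *= (V @ H.T) / (W @ H @ H.T + EPS)
    return W, H


def encode(V, W, n_iter=200, seed=0):
    """Estimate activations H for V with the basis matrix W held fixed."""
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V.shape[1])) + EPS
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + EPS)
    return H


def suppress_noise(V_noisy, W_speech, W_noise):
    """Return a noise-reduced magnitude spectrogram.

    Assumption: the noisy spectrogram is encoded against [W_speech, W_noise],
    and the part explained by the noise bases is subtracted, floored at zero.
    """
    W = np.hstack([W_speech, W_noise])
    H = encode(V_noisy, W)
    H_noise = H[W_speech.shape[1]:]      # activations of the noise bases
    noise_est = W_noise @ H_noise        # estimated noise magnitude
    return np.maximum(V_noisy - noise_est, 0.0)


# Random nonnegative matrices stand in for real magnitude spectrograms.
rng = np.random.default_rng(1)
clean_train = rng.random((64, 100))
noise_train = rng.random((64, 100))
noisy_test = rng.random((64, 30))

W_s, _ = nmf(clean_train, rank=8)   # speech basis spectra
W_n, _ = nmf(noise_train, rank=4)   # noise basis spectra
enhanced = suppress_noise(noisy_test, W_s, W_n)
```

In a full pipeline, `enhanced` would be paired with the phase of the noisy STFT and inverted to a waveform before feature extraction. The sub-band variant mentioned in the abstract could presumably be approximated by applying the same functions to row blocks of the spectrogram, although the abstract does not specify the band layout.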