• DocumentCode
    1874272
  • Title
    A novel spectral subtraction scheme for robust speech recognition: spectral subtraction using spectral harmonics of speech
  • Author
    Beh, Jounghoon; Ko, Hanseok
  • Author_Institution
    Dept. of Electron. & Comput. Eng., Korea Univ., Seoul, South Korea
  • Volume
    3
  • fYear
    2003
  • fDate
    6-9 July 2003
  • Abstract
    This paper presents a novel noise-compensation scheme that addresses the mismatch between training and testing conditions in automatic speech recognition (ASR) systems, specifically in car environments. Conventional spectral subtraction schemes rely on the signal-to-noise ratio (SNR): attenuation is imposed on the parts of the spectrum that appear to have low SNR, and accentuation is applied to the parts with high SNR. However, these schemes are based on the postulation that the power spectrum of noise is in general lower in magnitude than that of speech. While this postulation is adequate for high-SNR environments, it is grossly inadequate for low-SNR scenarios such as a car environment. This paper proposes an efficient spectral subtraction scheme aimed specifically at low-SNR noisy environments, which distinguishes the speech-dominant segment from the noise-dominant segment in the speech spectrum. Representative experiments confirm the superior performance of the proposed method over conventional methods. The experiments are conducted using car-noise-corrupted utterances from the Aurora2 corpus.
  • Keywords
    harmonics; noise; spectral analysis; speech processing; speech recognition; Aurora2 corpus; automatic speech recognition; car environments; noise power spectrum; noise-compensation scheme; noise-corrupted utterance; noise-corrupted utterances; noise-dominant segment; robust speech recognition; signal to noise ratio; spectral subtraction scheme; speech spectral harmonics; speech spectrum; speech-dominant segment; Attenuation; Automatic speech recognition; Automatic testing; Noise level; Noise robustness; Signal to noise ratio; Speech enhancement; Speech recognition; System testing; Working environment noise
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Multimedia and Expo, 2003. ICME '03. Proceedings. 2003 International Conference on
  • Print_ISBN
    0-7803-7965-9
  • Type
    conf
  • DOI
    10.1109/ICME.2003.1221391
  • Filename
    1221391
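
The conventional, SNR-driven spectral subtraction that the abstract contrasts against can be sketched roughly as follows. This is a generic textbook form (over-subtraction with a spectral floor), not the harmonics-based scheme proposed in the paper, whose details are not given in this record. The function name `spectral_subtract` and the parameters `alpha` (over-subtraction factor) and `beta` (spectral floor factor) are assumptions introduced only for illustration.

```python
def spectral_subtract(noisy_power, noise_power, alpha=2.0, beta=0.01):
    """Generic power spectral subtraction on one frame (illustrative sketch).

    noisy_power: per-bin power spectrum of the noisy speech frame
    noise_power: per-bin estimate of the noise power spectrum
    alpha:       hypothetical over-subtraction factor
    beta:        hypothetical spectral floor factor
    """
    cleaned = []
    for y, n in zip(noisy_power, noise_power):
        # Subtract the scaled noise estimate from this frequency bin.
        # Bins with low SNR would go negative, so they are clamped to a
        # small floor proportional to the noise estimate instead.
        cleaned.append(max(y - alpha * n, beta * n))
    return cleaned


# Bin 0 has high SNR and is mostly kept; bin 1 has low SNR and is floored.
result = spectral_subtract([10.0, 1.0], [1.0, 1.0])
```

In this sketch a bin is kept only when its power clearly exceeds the scaled noise estimate, which is the assumption the abstract describes as inadequate in low-SNR conditions such as a car interior.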