• DocumentCode
    134221
  • Title

    Single-channel dereverberation for distant-talking speech recognition by combining denoising autoencoder and temporal structure normalization

  • Author

    Ueda, Yuzuru ; Longbiao Wang ; Kai, Atsuhiko ; Xiong Xiao ; Eng Siong Chng ; Haizhou Li

  • Author_Institution
    Grad. Sch. of Eng., Shizuoka Univ., Hamamatsu, Japan
  • fYear
    2014
  • fDate
    12-14 Sept. 2014
  • Firstpage
    379
  • Lastpage
    383
  • Abstract
    In this paper, we propose a robust distant-talking speech recognition method that combines a cepstral-domain denoising autoencoder (DAE) with a temporal structure normalization (TSN) filter. In the proposed method, a DAE is first applied in the cepstral domain of speech to suppress reverberation; a post-processing step based on the TSN filter then reduces the noise and reverberation effects by normalizing the modulation spectra to reference spectra of clean speech. The proposed method was evaluated using speech in simulated and real reverberant environments. By combining a cepstral-domain DAE and TSN, the average word error rate (WER) was reduced from 25.2% for the baseline system to 21.2% in simulated environments and from 47.5% to 41.3% in real environments.
  • Keywords
    cepstral analysis; filtering theory; reverberation; signal denoising; speech coding; speech recognition; TSN filter; WER; cepstral domain denoising autoencoder; cepstral-domain DAE; cepstral-domain TSN; clean speech reference spectra; distant-talking speech recognition; modulation spectra normalization; noise reduction; post-processing technology; reverberation effects reduction; reverberation suppression; single-channel dereverberation; temporal structure normalization; word error rate; Abstracts; Educational institutions; Indexes; Noise reduction; Robustness; Spectral analysis; Speech; denoising autoencoder; dereverberation; distant-talking speech; environment adaptation; speech recognition;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Chinese Spoken Language Processing (ISCSLP), 2014 9th International Symposium on
  • Conference_Location
    Singapore
  • Type

    conf

  • DOI
    10.1109/ISCSLP.2014.6936613
  • Filename
    6936613