• DocumentCode
    49233
  • Title

    Multi-Microphone Speech Dereverberation and Noise Reduction Using Relative Early Transfer Functions

  • Author

    Schwartz, Ofer ; Gannot, Sharon ; Habets, Emanuel A. P.

  • Author_Institution
    Fac. of Eng., Bar-Ilan Univ., Ramat-Gan, Israel
  • Volume
    23
  • Issue
    2
  • fYear
    2015
  • fDate
    Feb. 2015
  • Firstpage
    240
  • Lastpage
    251
  • Abstract
    In speech communication systems, the microphone signals are degraded by reverberation and ambient noise. The reverberant speech can be separated into two components, namely, an early speech component that includes the direct path and some early reflections, and a late reverberant component that includes all the late reflections. In this paper, a novel algorithm to simultaneously suppress early reflections, late reverberation, and ambient noise is presented. A multi-microphone minimum mean square error estimator is used to obtain a spatially filtered version of the early speech component. The estimator is constructed as a minimum variance distortionless response (MVDR) beamformer (BF) followed by a postfilter (PF). Three unique design features characterize the proposed method. First, the MVDR BF is implemented in a special structure, named the nonorthogonal generalized sidelobe canceller (NO-GSC). Compared with the more conventional orthogonal GSC structure, the new structure allows for a simpler implementation of the GSC blocks for various MVDR constraints. Second, in contrast to earlier works, relative early transfer functions (RETFs) are used in the MVDR criterion rather than either the entire relative transfer functions (RTFs) or only the direct path of the desired speech signal. An estimator of the RETFs is proposed as well. Third, the late reverberation and noise are processed by both the beamforming stage and the PF stage. Since the relative power of the noise and the late reverberation varies with the frame index, a computationally efficient method for the required matrix inversion is proposed to circumvent the cumbersome mathematical operation. The algorithm was evaluated and compared with two alternative multichannel algorithms and one single-channel algorithm using simulated data and data recorded in a room with a reverberation time of 0.5 s for various source-microphone array distances (1-4 m) and several signal-to-noise levels. The processed signals were tested using two commonly used objective measures, namely perceptual evaluation of speech quality and log-spectral distance. As an additional objective measure, the improvement in word accuracy percentage of an acoustic speech recognition system is also demonstrated.
  • Keywords
    array signal processing; mean square error methods; microphone arrays; reverberation; speech recognition; speech synthesis; GSC blocks; MVDR BF; acoustic speech recognition system; ambient noise; beamformer; log-spectral distance; microphone signals; minimum variance distortionless response; multimicrophone minimum mean square error estimator; multimicrophone speech dereverberation; noise reduction; nonorthogonal generalized sidelobe canceller; orthogonal GSC structure; postfilter; reflections; relative early transfer functions; reverberant speech; reverberation time; signal-to-noise levels; single-channel algorithm; source-microphone array; speech communication systems; speech component; speech quality; speech signal; Microphones; Noise; Reverberation; Speech; Speech processing; Transfer functions; Dereverberation; generalized sidelobe canceller; minimum variance distortionless response (MVDR) beamforming; multichannel Wiener filter; relative transfer function;
  • fLanguage
    English
  • Journal_Title
    Audio, Speech, and Language Processing, IEEE/ACM Transactions on
  • Publisher
    IEEE
  • ISSN
    2329-9290
  • Type

    jour

  • DOI
    10.1109/TASLP.2014.2372335
  • Filename
    6963314