• DocumentCode
    2053334
  • Title
    A microphone array system integrating beamforming, feature enhancement, and spectral mask-based noise estimation

  • Author
    Yoshioka, Takuya ; Nakatani, Tomohiro

  • Author_Institution
    NTT Commun. Sci. Labs., NTT Corp., Kyoto, Japan
  • fYear
    2011
  • fDate
    May 30 2011-June 1 2011
  • Firstpage
    219
  • Lastpage
    224
  • Abstract
    This paper proposes a microphone array system that integrates beamforming, feature enhancement, and highly accurate noise feature model estimation based on spectral masking. Previously proposed methods for combining beamformers and single-channel post-filters estimate noise power spectra or noise features based only on spatial information acquired from multiple microphones. These methods suffer from low noise estimation accuracy when the number of available microphones is limited or when there are array calibration or steering vector estimation errors. By contrast, the proposed method estimates a noise feature model accurately in a highly adaptive way by capitalizing on both spatial information and the characteristics of speech. Specifically, the method leverages an inter-microphone phase difference model, a clean feature model, and a harmonicity-based spectral mask model for the accurate estimation of spectral masks, each of which indicates the presence or absence of speech at a particular frequency bin. The estimated spectral masks are used to obtain the time-varying noise feature model. Results of a digit recognition experiment show that the proposed system significantly outperforms an existing microphone array system combining a beamformer and a post-filter.
  • Keywords
    array signal processing; filtering theory; microphone arrays; speech recognition; array calibration; beamforming integration; feature enhancement; harmonicity-based spectral mask model; intermicrophone phase difference model; microphone array system; single-channel post-filters; spectral mask-based noise estimation; speech recognition systems; steering vector estimation errors; time-varying noise feature model; Adaptation models; Estimation; Feature extraction; Indexes; Microphone arrays; Noise; Microphone array; beamformer; noise estimation; robust speech recognition;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Hands-free Speech Communication and Microphone Arrays (HSCMA), 2011 Joint Workshop on
  • Conference_Location
    Edinburgh
  • Print_ISBN
    978-1-4577-0997-5
  • Type
    conf
  • DOI
    10.1109/HSCMA.2011.5942402
  • Filename
    5942402