• DocumentCode
    1488032
  • Title
    Sequential Organization of Speech in Reverberant Environments by Integrating Monaural Grouping and Binaural Localization
  • Author
    Woodruff, John ; Wang, DeLiang
  • Author_Institution
    Dept. of Comput. Sci. & Eng., Ohio State Univ., Columbus, OH, USA
  • Volume
    18
  • Issue
    7
  • fYear
    2010
  • Firstpage
    1856
  • Lastpage
    1866
  • Abstract
    Existing binaural approaches to speech segregation place an exclusive burden on cues related to the location of sound sources in space. These approaches can achieve excellent performance in anechoic conditions but degrade rapidly in realistic environments where room reverberation corrupts localization cues. In this paper, we propose to integrate monaural and binaural processing to achieve segregation and localization of voiced speech in reverberant environments. The proposed approach builds on monaural analysis for simultaneous organization, and combines it with a novel method for generation of location-based cues in a probabilistic framework that jointly achieves localization and sequential organization. We compare localization performance to two existing methods, sequential organization performance to a model-based system that uses only monaural cues, and segregation performance to an exclusively binaural system. Results suggest that the proposed framework allows for improved source localization and robust segregation of voiced speech in environments with considerable reverberation.
  • Keywords
    reverberation; speech processing; anechoic conditions; binaural localization; localization performance; location-based cues; monaural grouping; reverberant environments; room reverberation; sequential organization; speech segregation; voiced speech; Array signal processing; Computer science; Degradation; Filtering; Image analysis; Reverberation; Robustness; Speech analysis; Speech processing; Time frequency analysis; Binaural speech segregation; computational auditory scene analysis; monaural grouping; sequential organization; sound localization;
  • fLanguage
    English
  • Journal_Title
    Audio, Speech, and Language Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1558-7916
  • Type
    jour
  • DOI
    10.1109/TASL.2010.2050087
  • Filename
    5462949