DocumentCode
1488032
Title
Sequential Organization of Speech in Reverberant Environments by Integrating Monaural Grouping and Binaural Localization
Author
Woodruff, John; Wang, DeLiang
Author_Institution
Dept. of Comput. Sci. & Eng., Ohio State Univ., Columbus, OH, USA
Volume
18
Issue
7
fYear
2010
Firstpage
1856
Lastpage
1866
Abstract
Existing binaural approaches to speech segregation place an exclusive burden on cues related to the location of sound sources in space. These approaches can achieve excellent performance in anechoic conditions but degrade rapidly in realistic environments where room reverberation corrupts localization cues. In this paper, we propose to integrate monaural and binaural processing to achieve segregation and localization of voiced speech in reverberant environments. The proposed approach builds on monaural analysis for simultaneous organization, and combines it with a novel method for generation of location-based cues in a probabilistic framework that jointly achieves localization and sequential organization. We compare localization performance to two existing methods, sequential organization performance to a model-based system that uses only monaural cues, and segregation performance to an exclusively binaural system. Results suggest that the proposed framework allows for improved source localization and robust segregation of voiced speech in environments with considerable reverberation.
Keywords
reverberation; speech processing; anechoic conditions; binaural localization; localization performance; location-based cues; monaural grouping; reverberant environments; room reverberation; sequential organization; speech segregation; voiced speech; Array signal processing; Computer science; Degradation; Filtering; Image analysis; Reverberation; Robustness; Speech analysis; Speech processing; Time frequency analysis; Binaural speech segregation; computational auditory scene analysis; monaural grouping; sequential organization; sound localization
fLanguage
English
Journal_Title
Audio, Speech, and Language Processing, IEEE Transactions on
Publisher
IEEE
ISSN
1558-7916
Type
jour
DOI
10.1109/TASL.2010.2050087
Filename
5462949