DocumentCode :
1488032
Title :
Sequential Organization of Speech in Reverberant Environments by Integrating Monaural Grouping and Binaural Localization
Author :
Woodruff, John ; Wang, DeLiang
Author_Institution :
Dept. of Comput. Sci. & Eng., Ohio State Univ., Columbus, OH, USA
Volume :
18
Issue :
7
Year :
2010
Firstpage :
1856
Lastpage :
1866
Abstract :
Existing binaural approaches to speech segregation place an exclusive burden on cues related to the location of sound sources in space. These approaches can achieve excellent performance in anechoic conditions but degrade rapidly in realistic environments where room reverberation corrupts localization cues. In this paper, we propose to integrate monaural and binaural processing to achieve segregation and localization of voiced speech in reverberant environments. The proposed approach builds on monaural analysis for simultaneous organization, and combines it with a novel method for generation of location-based cues in a probabilistic framework that jointly achieves localization and sequential organization. We compare localization performance to two existing methods, sequential organization performance to a model-based system that uses only monaural cues, and segregation performance to an exclusively binaural system. Results suggest that the proposed framework allows for improved source localization and robust segregation of voiced speech in environments with considerable reverberation.
Keywords :
reverberation; speech processing; anechoic conditions; binaural localization; localization performance; location-based cues; monaural grouping; reverberant environments; room reverberation; sequential organization; speech segregation; voiced speech; Array signal processing; Computer science; Degradation; Filtering; Image analysis; Reverberation; Robustness; Speech analysis; Speech processing; Time frequency analysis; Binaural speech segregation; computational auditory scene analysis; monaural grouping; sequential organization; sound localization
Language :
English
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1558-7916
Type :
jour
DOI :
10.1109/TASL.2010.2050087
Filename :
5462949