Regional Information Center for Science and Technology - Binaural Detection, Localization, and Segregation in Reverberant Environments Based on Joint Pitch and Azimuth Cues

DocumentCode :

744142

Title :

Binaural Detection, Localization, and Segregation in Reverberant Environments Based on Joint Pitch and Azimuth Cues

Author :

Woodruff, Jonathan ; Wang, DeLiang

Author_Institution :

Dept. of Comput. Sci. & Eng., Ohio State Univ., Columbus, OH, USA

Volume :

Issue :

fYear :

2013

fDate :

4/1/2013 12:00:00 AM

Firstpage :

806

Lastpage :

815

Abstract :

We propose an approach to binaural detection, localization and segregation of speech based on pitch and azimuth cues. We formulate the problem as a search through a multisource state space across time, where each multisource state encodes the number of active sources, and the azimuth and pitch of each active source. A set of multilayer perceptrons is trained to assign time-frequency units to one of the active sources in each multisource state based jointly on observed pitch and azimuth cues. We develop a novel hidden Markov model framework to estimate the most probable path through the multisource state space. An estimated state path encodes a solution to the detection, localization, pitch estimation and simultaneous organization problems. Segregation is then achieved with an azimuth-based sequential organization stage. We demonstrate that the proposed framework improves segregation relative to several two-microphone comparison systems that are based solely on azimuth cues. Performance gains are consistent across a variety of reverberant conditions.
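
The abstract's "most probable path through the multisource state space" is the kind of quantity that standard Viterbi decoding computes for a hidden Markov model. The sketch below is a generic textbook Viterbi decoder, not the authors' implementation. The three toy states (zero, one, or two active sources, each with made-up azimuth and pitch values), the transition probabilities, and the per-frame likelihoods are all hypothetical placeholders chosen for illustration. In the paper the per-state evidence would come from the trained multilayer perceptrons, which are not modeled here.

```python
import math

def viterbi(log_init, log_trans, log_obs):
    """Return the most probable state path and its log-probability.

    log_init[s]     : log P(state s at frame 0)
    log_trans[r][s] : log P(state s at frame t | state r at frame t-1)
    log_obs[t][s]   : log-likelihood of frame t's observation under state s
    """
    n_states = len(log_init)
    # Best log-score of any path ending in each state at frame 0.
    score = [log_init[s] + log_obs[0][s] for s in range(n_states)]
    back = []  # back[t-1][s] = best predecessor of state s at frame t
    for t in range(1, len(log_obs)):
        new_score, pointers = [], []
        for s in range(n_states):
            best_prev = max(range(n_states),
                            key=lambda r: score[r] + log_trans[r][s])
            pointers.append(best_prev)
            new_score.append(score[best_prev] + log_trans[best_prev][s]
                             + log_obs[t][s])
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    last = max(range(n_states), key=lambda s: score[s])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, score[last]

# Hypothetical multisource states: a tuple of (azimuth_deg, pitch_hz) pairs,
# one pair per active source. The values are illustrative only.
states = [
    (),                          # no active source
    ((-30, 120),),               # one active source
    ((-30, 120), (45, 210)),     # two active sources
]

# Assumed toy model: uniform start, "sticky" transitions between states.
n = len(states)
log_init = [math.log(1.0 / n)] * n
log_trans = [[math.log(0.8 if r == s else 0.1) for s in range(n)]
             for r in range(n)]

# Assumed per-frame likelihoods (rows: frames, columns: states).
obs = [
    [0.90, 0.05, 0.05],
    [0.05, 0.90, 0.05],
    [0.05, 0.90, 0.05],
    [0.05, 0.05, 0.90],
]
log_obs = [[math.log(p) for p in row] for row in obs]

path, log_prob = viterbi(log_init, log_trans, log_obs)
for t, s in enumerate(path):
    print(f"frame {t}: {len(states[s])} active source(s) {states[s]}")
```

Each decoded state index maps back to a source count plus an azimuth and pitch per source, which is how a single state path can answer the detection, localization, and pitch estimation questions together. A realistic state space would be far larger than three states, so a practical system would likely need pruning or a structured search rather than this exhaustive loop.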

Keywords :

Markov processes; multilayer perceptrons; reverberation; speech processing; active sources; azimuth cues; azimuth-based sequential organization stage; binaural detection; estimated state path encoding; hidden Markov model framework; multisource state-space; pitch cues; pitch estimation; reverberant environment localization; reverberant environment segregation; simultaneous organization problems; time-frequency units; two-microphone comparison systems; Acoustics; Azimuth; Estimation; Hidden Markov models; Joints; Organizations; Speech; Binaural speech segregation; computational auditory scene analysis; multipitch tracking; sound localization; source detection

fLanguage :

English

Journal_Title :

Audio, Speech, and Language Processing, IEEE Transactions on

Publisher :

IEEE

ISSN :

1558-7916

Type :

jour

DOI :

10.1109/TASL.2012.2236316

Filename :

6392900

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=744142