Regional Information Center for Science and Technology (RICEST) - Two-microphone source separation algorithm based on statistical modeling of angle distributions

DocumentCode :

3164706

Title :

Two-microphone source separation algorithm based on statistical modeling of angle distributions

Author :

Kim, Chanwoo ; Khawand, Charbel ; Stern, Richard M.

Author_Institution :

Windows Phone Div., Microsoft Corp., Redmond, WA, USA

fYear :

2012

fDate :

25-30 March 2012

Firstpage :

4629

Lastpage :

4632

Abstract :

In this paper, we present a novel two-microphone sound source separation algorithm that selects speech from the target speaker while suppressing signals from interfering sources. In this algorithm, which is referred to as SMAD-CW, we first estimate the direction of sound sources for each time-frequency bin using phase differences in the spectral domain. For each frame, we assume that the angle distribution is a mixture of two distributions, one from the target and the other from the dominant noise source. For each mixture component we use the von Mises distribution, which is a close approximation to the wrapped normal distribution. The expectation-maximization (EM) algorithm is employed to obtain the parameters of this mixture distribution. Using this statistical model, we perform maximum a posteriori (MAP) hypothesis testing in order to obtain appropriate binary masks. We demonstrate that the algorithm described in this paper provides speech recognition accuracy that is significantly better than that obtained using conventional approaches.
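The abstract gives no formulas, so the following is only a minimal sketch of the steps it names (angle from phase difference, a two-component von Mises mixture fitted by EM, and a MAP binary mask), not the authors' SMAD-CW implementation. Everything here is an assumption made for illustration: the function names, the far-field relation between phase difference and angle, the speed of sound of 343 m/s, the closed-form approximation used for the concentration parameter, the convention that the target sits near angle 0, and the synthetic test data.

```python
import math
import random

def angle_from_phase(phase_diff, freq_hz, mic_dist_m, c=343.0):
    """Arrival angle (radians) from an inter-microphone phase difference.
    Assumes a far-field source, so sin(angle) = c * phase / (2*pi*f*d)."""
    x = c * phase_diff / (2.0 * math.pi * freq_hz * mic_dist_m)
    return math.asin(max(-1.0, min(1.0, x)))

def log_i0(kappa):
    """Log of the modified Bessel function I0, summed from its power series."""
    term, total, k = 1.0, 1.0, 0
    while term > 1e-15 * total:
        k += 1
        term *= (kappa / 2.0) ** 2 / (k * k)
        total += term
    return math.log(total)

def vm_logpdf(theta, mu, kappa):
    """Log-density of a von Mises distribution at angle theta (radians)."""
    return kappa * math.cos(theta - mu) - math.log(2.0 * math.pi) - log_i0(kappa)

def fit_two_vm(angles, mu_init=(0.0, math.pi / 2), iters=50, kappa_max=100.0):
    """EM for a two-component von Mises mixture; returns (weights, mus, kappas)."""
    w, mu, kappa = [0.5, 0.5], list(mu_init), [2.0, 2.0]
    for _ in range(iters):
        # E-step: posterior probability of each component for each angle.
        resp = []
        for t in angles:
            logp = [math.log(w[k]) + vm_logpdf(t, mu[k], kappa[k]) for k in range(2)]
            m = max(logp)
            p = [math.exp(v - m) for v in logp]
            resp.append((p[0] / (p[0] + p[1]), p[1] / (p[0] + p[1])))
        # M-step: weighted circular mean and concentration per component.
        for k in range(2):
            n_k = sum(r[k] for r in resp)
            c = sum(r[k] * math.cos(t) for r, t in zip(resp, angles))
            s = sum(r[k] * math.sin(t) for r, t in zip(resp, angles))
            w[k] = max(n_k / len(angles), 1e-6)
            mu[k] = math.atan2(s, c)
            r_bar = min(math.hypot(c, s) / max(n_k, 1e-12), 0.999999)
            # Closed-form approximation to the maximum-likelihood kappa (assumption).
            kappa[k] = min(r_bar * (2.0 - r_bar ** 2) / (1.0 - r_bar ** 2), kappa_max)
    return w, mu, kappa

def map_mask(angles, w, mu, kappa, target=0):
    """Binary mask: 1 where the target component has the larger posterior."""
    other = 1 - target
    return [
        int(math.log(w[target]) + vm_logpdf(t, mu[target], kappa[target])
            > math.log(w[other]) + vm_logpdf(t, mu[other], kappa[other]))
        for t in angles
    ]

# Synthetic frame (hypothetical data): 300 target angles near 0 rad and
# 300 interferer angles near 1.2 rad, standing in for per-bin estimates.
rng = random.Random(0)
angles = [rng.vonmisesvariate(0.0, 10.0) for _ in range(300)]
angles += [rng.vonmisesvariate(1.2, 10.0) for _ in range(300)]

w, mu, kappa = fit_two_vm(angles)
mask = map_mask(angles, w, mu, kappa, target=0)
```

In a real front end, `angles` would come from applying something like `angle_from_phase` to every time-frequency bin of a short-time Fourier transform, and the resulting mask would be applied to the spectrum before resynthesis. Working in the log domain keeps the E-step and the MAP comparison numerically stable when the concentration parameters grow large.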

Keywords :

expectation-maximisation algorithm; microphones; source separation; speech recognition; SMAD-CW; angle distributions; binary masks; dominant noise source; expectation maximization algorithm; interfering sources; maximum a posteriori hypothesis testing; mixture distribution; phase differences; sound sources; speech recognition accuracy; statistical modeling; target speaker; time frequency bin; two microphone sound source separation algorithm; von Mises distribution; Abstracts; Robustness; Speech; Robust speech recognition; binaural hearing; interaural time difference; signal separation

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on

Conference_Location :

Kyoto

ISSN :

1520-6149

Print_ISBN :

978-1-4673-0045-2

Electronic_ISBN :

1520-6149

Type :

conf

DOI :

10.1109/ICASSP.2012.6288950

Filename :

6288950

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3164706