Regional Information Center for Science and Technology (RICeST) - Multi-channel source separation by factorial HMMs

DocumentCode :

3444770

Title :

Multi-channel source separation by factorial HMMs

Author :

Reyes-Gomez, Manuel J. ; Raj, Bhiksha ; Ellis, Daniel P W

Author_Institution :

Dept. of Electr. Eng., Columbia Univ., New York, NY, USA

Volume :

fYear :

2003

fDate :

6-10 April 2003

Abstract :

We present a new speaker-separation algorithm for separating signals with known statistical characteristics from mixed multi-channel recordings. Speaker separation has conventionally been treated as a problem of blind source separation (BSS). This approach does not utilize any knowledge of the statistical characteristics of the signals to be separated, relying mainly on the independence between the various signals to separate them. We present an algorithm that utilizes detailed statistical information about the signals to be separated, represented in the form of hidden Markov models (HMMs). We treat the signal separation problem as one of beamforming, where each signal is extracted using a filter-and-sum array. The filters are estimated to maximize the likelihood of the summed output, measured on the HMM for the desired signal. This is done by iteratively estimating the best state sequence through the HMM from a factorial HMM (FHMM) that is the cross-product of the HMMs for the multiple signals, using the current output of the array, and estimating the filters to maximize the likelihood of that state sequence. Experiments show that the proposed method can cleanly extract a background speaker who is 20 dB below the foreground speaker in a two-speaker mixture, when the HMMs for the signals are constructed from knowledge of the utterance transcriptions.
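
The filter-and-sum array mentioned in the abstract can be illustrated with a minimal sketch. It shows only the array's forward step: each microphone channel passes through its own FIR filter and the filtered channels are summed. The function names (`fir_filter`, `filter_and_sum`) and the hand-picked filter taps are assumptions made for illustration. The record does not give the authors' implementation, and the likelihood-based filter estimation and the FHMM state-sequence search described in the abstract are not reproduced here.

```python
from typing import List, Sequence


def fir_filter(signal: Sequence[float], taps: Sequence[float]) -> List[float]:
    """Causal FIR filtering: y[t] = sum over k of taps[k] * signal[t - k]."""
    out = []
    for t in range(len(signal)):
        acc = 0.0
        for k, h in enumerate(taps):
            if t - k >= 0:  # samples before the start of the recording count as zero
                acc += h * signal[t - k]
        out.append(acc)
    return out


def filter_and_sum(
    channels: Sequence[Sequence[float]],
    filters: Sequence[Sequence[float]],
) -> List[float]:
    """Filter each channel with its own FIR filter, then sum across channels."""
    if len(channels) != len(filters):
        raise ValueError("need exactly one filter per channel")
    n = len(channels[0])
    output = [0.0] * n
    for signal, taps in zip(channels, filters):
        if len(signal) != n:
            raise ValueError("all channels must have the same length")
        filtered = fir_filter(signal, taps)
        for t in range(n):
            output[t] += filtered[t]
    return output


# Toy two-microphone example (hypothetical data): the second microphone
# receives the same source one sample later than the first.
mic_1 = [1.0, 2.0, 3.0, 4.0]
mic_2 = [0.0, 1.0, 2.0, 3.0]

# Hand-picked filters (an assumption, not estimated): delay mic_1 by one
# sample so both channels line up, and average the two aligned channels.
filters = [[0.0, 0.5], [0.5]]

aligned_sum = filter_and_sum([mic_1, mic_2], filters)
```

In this toy case the filters simply time-align and average the two channels, so `aligned_sum` is the source as seen at the second microphone. In the method the abstract describes, the filter coefficients are instead estimated iteratively to maximize the likelihood of the summed output under the HMM for the desired speaker.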

Keywords :

array signal processing; filtering theory; hidden Markov models; iterative methods; maximum likelihood estimation; source separation; speech processing; beamforming; blind source separation; factorial HMM; filter-and-sum array; hidden Markov models; iterative estimation; maximum likelihood estimation; multi-channel source separation; signal extraction; signal separation; speaker-separation algorithm; statistical characteristics; utterance transcriptions; Blind source separation; Data mining; Filters; Hidden Markov models; Independent component analysis; Laboratories; Microphones; Source separation; Speech; State estimation;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03). 2003 IEEE International Conference on

ISSN :

1520-6149

Print_ISBN :

0-7803-7663-3

Type :

conf

DOI :

10.1109/ICASSP.2003.1198868

Filename :

1198868

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3444770