Sparse Overcomplete Decomposition for Single Channel Speaker Separation

Author

Shashanka, M.V.S. ; Raj, Bhiksha ; Smaragdis, Paris

Author_Institution

Hearing Res. Center, Boston Univ., MA, USA

Volume

2

fYear

2007

fDate

15-20 April 2007

Abstract

We present an algorithm for separating multiple speakers from a mixed single-channel recording. The algorithm builds on a model proposed by Raj and Smaragdis (2005), which extracts characteristic spectro-temporal basis functions from training data for individual speakers and decomposes mixed signals as linear combinations of these learned bases. In other words, their model extracts a compact code of basis functions that explains the space spanned by a speaker's spectral vectors. Our model instead generates a sparse-distributed code that is overcomplete: it uses more basis functions than the dimensionality of the space. We propose a probabilistic framework to achieve sparsity. Experiments show that the resulting sparse code better captures the structure in the data and hence leads to better separation.
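
The decomposition step described in the abstract can be illustrated with a minimal sketch. It assumes each speaker's bases are already learned and fixed, treats each basis as a distribution over frequency bins, and estimates mixture weights for one spectral frame with a plain EM update. It does not implement the paper's sparsity framework, and the names estimate_weights, separate, bases, and owner, along with the toy numbers, are hypothetical and chosen only for illustration.

```python
def estimate_weights(spectrum, bases, n_iter=200):
    """EM estimate of mixture weights for fixed, normalized bases.

    spectrum: nonnegative magnitudes, one per frequency bin.
    bases: list of basis vectors; each sums to 1 over the frequency bins.
    """
    n_bases = len(bases)
    weights = [1.0 / n_bases] * n_bases  # start from uniform weights
    for _ in range(n_iter):
        new_weights = [0.0] * n_bases
        for f, value in enumerate(spectrum):
            denom = sum(weights[z] * bases[z][f] for z in range(n_bases))
            if value == 0.0 or denom == 0.0:
                continue
            for z in range(n_bases):
                # Posterior of basis z at bin f, scaled by the observed energy.
                new_weights[z] += value * weights[z] * bases[z][f] / denom
        norm = sum(new_weights)
        weights = [w / norm for w in new_weights]
    return weights


def separate(spectrum, bases, owner, n_iter=200):
    """Split one mixed frame into per-speaker frames.

    owner[z] is the index of the speaker that basis z was learned from.
    """
    weights = estimate_weights(spectrum, bases, n_iter)
    n_speakers = max(owner) + 1
    parts = [[0.0] * len(spectrum) for _ in range(n_speakers)]
    for f, value in enumerate(spectrum):
        denom = sum(weights[z] * bases[z][f] for z in range(len(bases)))
        if denom == 0.0:
            continue
        for z in range(len(bases)):
            # Each basis claims its posterior share of the energy in bin f.
            parts[owner[z]][f] += value * weights[z] * bases[z][f] / denom
    return parts


# Toy setup (assumed values): 4 frequency bins, 3 bases per speaker, so the
# 6 bases outnumber the 4 dimensions, as in an overcomplete code.
bases = [
    [0.6, 0.3, 0.1, 0.0], [0.4, 0.5, 0.1, 0.0], [0.8, 0.1, 0.1, 0.0],  # speaker 0
    [0.0, 0.1, 0.4, 0.5], [0.0, 0.1, 0.2, 0.7], [0.0, 0.1, 0.6, 0.3],  # speaker 1
]
owner = [0, 0, 0, 1, 1, 1]
mixture = [3.0, 2.0, 2.5, 2.5]
parts = separate(mixture, bases, owner)
```

In this sketch the per-speaker frames in parts add back up to the mixed frame bin by bin, because each bin's energy is shared out according to the posterior over bases. A sparse variant would add a penalty or prior on the weights during the update, which is left out here.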

Keywords

source separation; speech processing; characteristic spectro-temporal basis functions; mixed single channel recording; single channel speaker separation; sparse overcomplete decomposition; sparse-distributed code; spectral vectors; Auditory system; Data mining; Entropy; Equations; Frequency; Graphical models; Random processes; Speech enhancement; Training data; Vectors; MAP estimation; Minimum entropy methods; Separation

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech and Signal Processing, 2007. ICASSP 2007. IEEE International Conference on

Conference_Location

Honolulu, HI

ISSN

1520-6149

Print_ISBN

1-4244-0727-3

Type

conf

DOI

10.1109/ICASSP.2007.366317

Filename

4217490