DocumentCode
463711
Title
Sparse Overcomplete Decomposition for Single Channel Speaker Separation
Author
Shashanka, M.V.S. ; Raj, Bhiksha ; Smaragdis, Paris
Author_Institution
Hearing Res. Center, Boston Univ., MA, USA
Volume
2
fYear
2007
fDate
15-20 April 2007
Abstract
We present an algorithm for separating multiple speakers from a mixed single-channel recording. The algorithm is based on a model proposed by Raj and Smaragdis (2005). The idea is to extract characteristic spectro-temporal basis functions from training data for individual speakers and to decompose the mixed signals as linear combinations of these learned bases. In other words, their model extracts a compact code of basis functions that can explain the space spanned by the spectral vectors of a speaker. In our model, we generate a sparse-distributed code in which there are more basis functions than dimensions of the space, that is, an overcomplete code. We propose a probabilistic framework to achieve sparsity. Experiments show that the resulting sparse code better captures the structure in the data and hence leads to better separation.
Keywords
source separation; speech processing; characteristic spectro-temporal basis functions; mixed single channel recording; single channel speaker separation; sparse overcomplete decomposition; sparse-distributed code; spectral vectors; Auditory system; Data mining; Entropy; Equations; Frequency; Graphical models; Random processes; Speech enhancement; Training data; Vectors; MAP estimation; Minimum entropy methods; Separation
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing, 2007. ICASSP 2007. IEEE International Conference on
Conference_Location
Honolulu, HI
ISSN
1520-6149
Print_ISBN
1-4244-0727-3
Type
conf
DOI
10.1109/ICASSP.2007.366317
Filename
4217490