DocumentCode
3347896
Title
Low complexity Bayesian single channel source separation
Author
Beierholm, T. ; Pedersen, Brian Dam ; Winther, Ole
Author_Institution
GN ReSound A/S, Tastrup, Denmark
Volume
5
fYear
2004
fDate
17-21 May 2004
Abstract
We propose a simple Bayesian model for single channel speech separation using factorized source priors in a sliding window linearly transformed domain. Modeling each band source with a one-dimensional mixture of Gaussians leads to fast, tractable inference for the source signals. In simulations separating a male and a female speaker, using priors trained on the same speakers, performance is comparable to the blind separation approach of G.-J. Jang and T.-W. Lee (see NIPS, vol. 15, 2003), with an SNR improvement of 4.9 dB for both the male and the female speaker. Mixing coefficients can be estimated quite precisely using ML-II, but this estimation is quite sensitive to the accuracy of the priors, whereas the source separation quality for known mixing coefficients is quite insensitive to it. Finally, we discuss how to improve our approach while keeping the complexity low using machine learning and CASA (computational auditory scene analysis) approaches (Jang and Lee, 2003; Roweis, S.T., 2001; Wang, D.L. and Brown, G.J., 1999; Hu, G. and Wang, D., 2003).
Keywords
Bayes methods; Gaussian processes; computational complexity; inference mechanisms; learning (artificial intelligence); maximum likelihood estimation; source separation; speech processing; 1D Gaussian mixture; Bayesian single channel source separation; blind separation; complexity; computational auditory scene analysis; factorized source priors; female speaker; machine learning; male speaker; mixing coefficient estimation; sliding window linearly transformed domain; speech separation; Bayesian methods; Content addressable storage; Discrete cosine transforms; Filters; Hidden Markov models; Speech
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
ISSN
1520-6149
Print_ISBN
0-7803-8484-9
Type
conf
DOI
10.1109/ICASSP.2004.1327164
Filename
1327164