Low complexity Bayesian single channel source separation

Author

Beierholm, T.; Pedersen, Brian Dam; Winther, Ole

Author_Institution

GN ReSound A/S, Tastrup, Denmark

Volume

5

fYear

2004

fDate

17-21 May 2004

Abstract

We propose a simple Bayesian model for single-channel speech separation using factorized source priors in a sliding-window, linearly transformed domain. Modeling each band source with a one-dimensional mixture of Gaussians leads to fast, tractable inference for the source signals. Simulations separating a male and a female speaker, using priors trained on the same speakers, show performance comparable to the blind separation approach of G.-J. Jang and T.-W. Lee (see NIPS, vol. 15, 2003), with an SNR improvement of 4.9 dB for both the male and the female speaker. Mixing coefficients can be estimated quite precisely using ML-II, but this estimation is sensitive to the accuracy of the priors, whereas the source separation quality for known mixing coefficients is largely insensitive to it. Finally, we discuss how to improve our approach while keeping the complexity low using machine learning and CASA (computational auditory scene analysis) approaches (Jang and Lee, 2003; Roweis, S.T., 2001; Wang, D.L. and Brown, G.J., 1999; Hu, G. and Wang, D., 2003).
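The tractability claim in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single band coefficient x observed as an instantaneous mixture x = a1*s1 + a2*s2 (plus optional Gaussian noise), with each source given an independent one-dimensional mixture-of-Gaussians prior. The function name `posterior_means`, its parameters, and the `noise_var` term are hypothetical and chosen only for this illustration. Under these assumptions the posterior over component pairs is a small table and the posterior mean of each source is available in closed form.

```python
import numpy as np

def posterior_means(x, a1, a2, pi1, mu1, var1, pi2, mu2, var2, noise_var=0.0):
    """Posterior means of s1 and s2 given x = a1*s1 + a2*s2 + noise.

    Assumed model (illustrative, not taken from the paper):
    s1 ~ sum_k pi1[k] N(mu1[k], var1[k]), s2 ~ sum_l pi2[l] N(mu2[l], var2[l]).
    """
    pi1, mu1, var1 = (np.asarray(v, dtype=float) for v in (pi1, mu1, var1))
    pi2, mu2, var2 = (np.asarray(v, dtype=float) for v in (pi2, mu2, var2))

    # Mean and variance of x for every component pair (k, l).
    m = a1 * mu1[:, None] + a2 * mu2[None, :]
    V = a1**2 * var1[:, None] + a2**2 * var2[None, :] + noise_var

    # Responsibilities r[k, l] = p(k, l | x), computed in the log domain.
    log_r = (np.log(pi1)[:, None] + np.log(pi2)[None, :]
             - 0.5 * np.log(2.0 * np.pi * V) - 0.5 * (x - m) ** 2 / V)
    r = np.exp(log_r - log_r.max())
    r /= r.sum()

    # Conditional Gaussian means given the pair (k, l), averaged over r.
    gain1 = a1 * var1[:, None] / V
    gain2 = a2 * var2[None, :] / V
    s1_hat = np.sum(r * (mu1[:, None] + gain1 * (x - m)))
    s2_hat = np.sum(r * (mu2[None, :] + gain2 * (x - m)))
    return s1_hat, s2_hat, r
```

With K and L components per source, the cost per coefficient is O(K*L), which is consistent with the low-complexity aim stated above. When `noise_var` is zero the two estimates reproduce the observation exactly, i.e. a1*s1_hat + a2*s2_hat equals x.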

Keywords

Bayes methods; Gaussian processes; computational complexity; inference mechanisms; learning (artificial intelligence); maximum likelihood estimation; source separation; speech processing; 1D Gaussian mixture; Bayesian single channel source separation; blind separation; complexity; computational auditory scene analysis; factorized source priors; female speaker; machine learning; male speaker; mixing coefficient estimation; sliding window linearly transformed domain; speech separation; Bayesian methods; Content addressable storage; Discrete cosine transforms; Filters; Hidden Markov models; Speech

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on

ISSN

1520-6149

Print_ISBN

0-7803-8484-9

Type

conf

DOI

10.1109/ICASSP.2004.1327164

Filename

1327164