DocumentCode :
155685
Title :
Joint audio source separation and dereverberation based on multichannel factorial hidden Markov model
Author :
Higuchi, Tatsuro ; Kameoka, Hirokazu
Author_Institution :
Grad. Sch. of Inf. Sci. & Technol., Univ. of Tokyo, Tokyo, Japan
fYear :
2014
fDate :
21-24 Sept. 2014
Firstpage :
1
Lastpage :
6
Abstract :
This paper proposes a unified approach for jointly solving underdetermined source separation, audio event detection and dereverberation of convolutive mixtures. For monaural source separation, one successful approach involves applying non-negative matrix factorization (NMF) to the magnitude spectrogram of a mixture signal, interpreted as a non-negative matrix. Several attempts have recently been made to extend this approach to the multichannel case in order to utilize the spatial correlation of the multichannel inputs as an additional clue for source separation. Multichannel NMF assumes that an observed signal is a mixture of a limited number of source signals, each of which has a static power spectral density scaled by a time-varying amplitude. We have previously proposed an extension of this approach in which the variations over time of the spectral density and the total power of each source are modeled by a hidden Markov model (HMM). This has allowed us to solve source activity detection and source separation simultaneously through model parameter inference. While this method was based on an anechoic mixing model, the aim of this paper is to further extend the above approach to deal with reverberation by incorporating an echoic mixing model into the generative model of observed signals. In an underdetermined source separation experiment under reverberant conditions, we confirmed that the proposed method provided a 9.61 dB improvement in signal-to-interference ratio over the conventional method.
Keywords :
acoustic convolution; audio signal processing; blind source separation; echo; hidden Markov models; matrix decomposition; reverberation; signal detection; spectral analysis; time-varying channels; NMF; anechoic mixing model; audio event detection; audio source dereverberation; audio source separation; convolutive mixture dereverberation; echoic mixing model; mixture signal spectrogram; multichannel factorial hidden Markov model; multichannel spatial correlation utilization; nonnegative matrix factorization; observed signal generative model; reverberation; signal-to-interference ratio; source activity detection; spectral density time variation; static power spectral density; time-varying amplitude; Arrays; Hidden Markov models; Microphones; Optimization; Source separation; Spectrogram; Time-frequency analysis; audio event detection; dereverberation; multichannel factorial hidden Markov model; non-negative matrix factorization; source separation
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Machine Learning for Signal Processing (MLSP), 2014 IEEE International Workshop on
Conference_Location :
Reims
Type :
conf
DOI :
10.1109/MLSP.2014.6958927
Filename :
6958927