DocumentCode :
2172521
Title :
Stereophonic spectrogram segmentation using Markov random fields
Author :
Kim, Minje ; Smaragdis, Paris ; Ko, Glenn G. ; Rutenbar, Rob A.
Author_Institution :
Dept. of Comput. Sci., Univ. of Illinois at Urbana-Champaign, Urbana, IL, USA
fYear :
2012
fDate :
23-26 Sept. 2012
Firstpage :
1
Lastpage :
6
Abstract :
Source separation approaches that use spectrograms captured from multiple microphones have much in common with computer vision algorithms that use multiple images for segmentation problems. Just as one would use Markov random fields (MRFs) to solve image segmentation problems, we propose a method of modeling source separation using MRFs and then solving such problems via common MRF inference methods. To this end, as a preprocessing step, we convert stereophonic spectrograms into an integrated form based on their inter-channel level differences (ILD), a procedure analogous to obtaining a disparity map from stereo images for matching problems. Given the ILD matrix as an observed image, we estimate latent labels that represent the responsibility of each spectrogram's time/frequency bin to a specific sound source. The proposed method shows reasonable separation performance in a variety of mixing environments, including online separation and moving sources. We expect this new way of formulating source separation problems to help exploit the advantages of probabilistic graphical models and recent advances in low-power, high-performance hardware suited for such tasks.
Keywords :
Markov processes; acoustic generators; blind source separation; computer vision; image matching; image sampling; image segmentation; inference mechanisms; matrix algebra; source separation; stereo image processing; time-frequency analysis; ILD matrix; MRF; Markov random fields; common MRF inference methods; computer vision algorithms; disparity map; image matching problems; interchannel level differences; low-power high-performance hardware; mixing environments; moving sources; multiple images segmentation problems; multiple microphones; observed image; probabilistic graphical models; reasonable separation performance; source separation modeling; spectrogram time-frequency bin; stereo images; stereophonic spectrogram segmentation; Labeling; Markov processes; Microphones; Noise; Source separation; Spectrogram; Time frequency analysis; Blind Source Separation; Gibbs Sampling; Markov Random Fields; Probabilistic Graphical Model;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Machine Learning for Signal Processing (MLSP), 2012 IEEE International Workshop on
Conference_Location :
Santander
ISSN :
1551-2541
Print_ISBN :
978-1-4673-1024-6
Electronic_ISBN :
1551-2541
Type :
conf
DOI :
10.1109/MLSP.2012.6349754
Filename :
6349754