DocumentCode :
1513795
Title :
Glimpsing IVA: A Framework for Overcomplete/Complete/Undercomplete Convolutive Source Separation
Author :
Masnadi-Shirazi, Alireza ; Zhang, Wenyi ; Rao, Bhaskar D.
Author_Institution :
Dept. of Electr. & Comput. Eng., Univ. of California, La Jolla, CA, USA
Volume :
18
Issue :
7
fYear :
2010
Firstpage :
1841
Lastpage :
1855
Abstract :
Independent vector analysis (IVA) is a method for separating convolutively mixed signals that significantly reduces the occurrence of the well-known permutation problem in frequency-domain blind source separation (BSS). In this paper, we develop a novel IVA-based unifying framework for overcomplete/complete/undercomplete convolutive noisy BSS. We show that in order for the sources to be separable in the frequency domain, they must have a temporal dynamic structure. We exploit a common form of dynamics, especially present in speech, wherein the signals have intermittent silence periods, so that the set of active sources varies with time. This feature is extremely useful in dealing with overcomplete situations. An approach using hidden Markov models (HMMs) is proposed that takes advantage of different combinations of silence gaps of the source signals at each time period. This enables the algorithm to “glimpse” or listen in the gaps, hence compensating for the global degeneracy by allowing it to learn the mixing matrices at periods where it is locally less degenerate. The same glimpsing strategy can be applied to the complete/undercomplete case as well. Moreover, additive noise is considered in our model. Real and simulated experiments were carried out for overcomplete convolutive mixtures of speech signals, yielding improved separation results compared to a sparsity-based robust time-frequency masking method. Signal-to-disturbance ratio (SDR) and machine intelligibility of a speech recognizer were used to evaluate performance. Experiments were also conducted for the classical complete setting using the proposed algorithm, and the results compare favorably with standard IVA.
Keywords :
blind source separation; frequency-domain analysis; hidden Markov models; independent component analysis; matrix algebra; speech recognition; additive noise; complete convolutive source separation; frequency domain blind source separation; glimpsing IVA; hidden Markov model; independent component analysis; independent vector analysis; machine intelligibility; mixing matrices; overcomplete convolutive source separation; permutation problem; signal-to-disturbance ratio; sparsity-based robust time-frequency masking method; speech recognition; speech signal; temporal dynamic structure; undercomplete convolutive source separation; Additive noise; Blind source separation; Frequency domain analysis; Hidden Markov models; Noise robustness; Signal analysis; Source separation; Speech analysis; Speech coding; Time frequency analysis; Blind source separation (BSS); convolutive mixture; hidden Markov model (HMM); independent component analysis (ICA); independent vector analysis (IVA); overcomplete systems; speech recognition; underdetermined source separation;
fLanguage :
English
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1558-7916
Type :
jour
DOI :
10.1109/TASL.2010.2052609
Filename :
5483158