Title :
Structured Sparsity Models for Reverberant Speech Separation
Author :
Asaei, Afsaneh ; Golbabaee, M. ; Bourlard, Herve ; Cevher, Volkan
Author_Institution :
Idiap Res. Inst., Martigny, Switzerland
Abstract :
We tackle the speech separation problem by modeling the acoustics of reverberant chambers. Our approach exploits structured sparsity models to perform speech recovery and room acoustic modeling from recordings of concurrent unknown sources. The speakers are assumed to lie on a two-dimensional plane, and the multipath channel is characterized using the image model. We propose an algorithm for room geometry estimation that relies on localizing the early images of the speakers by sparse approximation of the spatial spectrum of the virtual sources in a free-space model. The images are then clustered by exploiting the low-rank structure of the spectro-temporal components belonging to each source. This enables us to identify the early support of the room impulse response function and its unique map to the room geometry. To further tackle the ambiguity of the reflection ratios, we propose a novel formulation of the reverberation model and estimate the absorption coefficients through a convex optimization that exploits a joint sparsity model formulated upon the spatio-spectral sparsity of the concurrent speech representation. The acoustic parameters are then incorporated for separating the individual speech signals through either structured sparse recovery or inverse filtering of the acoustic channels. Experiments conducted on real data recordings of spatially stationary sources demonstrate the effectiveness of the proposed approach for speech separation and recognition.
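As a rough illustration of the image model mentioned in the abstract, the sketch below computes the first-order image sources of a speaker on a two-dimensional plane and then maps them back to the room dimensions. It is not the paper's algorithm. It assumes a rectangular room with one corner at the origin and walls aligned with the axes, a source strictly inside the room, and image positions that are already known exactly (in the paper they are estimated by sparse approximation). The names `first_order_images` and `room_from_images` are hypothetical and chosen only for this example.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def first_order_images(source: Point, room: Tuple[float, float]) -> List[Point]:
    """Mirror the source across each of the four walls of an axis-aligned room.

    Assumption: walls lie at x = 0, x = lx, y = 0 and y = ly.
    """
    x, y = source
    lx, ly = room
    return [
        (-x, y),          # reflection across the wall x = 0
        (2 * lx - x, y),  # reflection across the wall x = lx
        (x, -y),          # reflection across the wall y = 0
        (x, 2 * ly - y),  # reflection across the wall y = ly
    ]


def room_from_images(source: Point, images: List[Point]) -> Tuple[float, float]:
    """Recover (lx, ly) from the source and its first-order images.

    For a source strictly inside the room, the largest image x-coordinate
    is 2 * lx - x, so the far wall sits midway between that image and the
    source. The same reasoning applies along y.
    """
    x, y = source
    lx = (max(px for px, _ in images) + x) / 2.0
    ly = (max(py for _, py in images) + y) / 2.0
    return (lx, ly)


if __name__ == "__main__":
    src = (1.0, 2.0)
    imgs = first_order_images(src, (4.0, 3.0))
    print(imgs)
    print(room_from_images(src, imgs))
```

Under these idealized assumptions, each wall lies midway between the source and its mirror image, which is why the located early images determine the room dimensions.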
Keywords :
compressed sensing; convex programming; multipath channels; optimisation; reverberation chambers; speech recognition; absorption coefficients; acoustic channels; acoustic parameters; convex optimization; free space model; image model; inverse filtering; joint sparsity model; multipath channel; reverberant chambers; reverberant speech separation; room acoustic modeling; room geometry estimation; room impulse response function; sparse approximation; spatial spectrum; spectro temporal components; speech recovery; structured sparsity models; two dimensional plane; IEEE transactions; Microphones; Reverberation; Speech; Speech processing; Speech recognition; Distant speech recognition; image model; multi-party reverberant recordings; room acoustic modeling; source separation; structured sparse recovery;
Journal_Title :
Audio, Speech, and Language Processing, IEEE/ACM Transactions on
DOI :
10.1109/TASLP.2013.2297012