Title :
Speech enhancement with sparse coding in learned dictionaries
Author :
Sigg, Christian D. ; Dikk, Tomas ; Buhmann, Joachim M.
Author_Institution :
Dept. of Comput. Sci., ETH Zurich, Zurich, Switzerland
Abstract :
The enhancement of speech degraded by non-stationary interferers is a highly relevant and difficult task in many signal processing applications. We present a monaural speech enhancement method based on sparse coding of noisy speech signals in a composite dictionary, formed by concatenating a speech dictionary and an interferer dictionary, either of which may be over-complete. The speech dictionary is learned off-line on a training corpus, while an environment-specific interferer dictionary is learned on-line during speech pauses. Our approach optimizes the trade-off between source distortion and source confusion, and thus achieves significant improvements on objective quality measures such as cepstral distance, in both the speaker-dependent and speaker-independent cases, in several real-world environments and at low signal-to-noise ratios. Our enhancement method outperforms state-of-the-art methods such as multi-band spectral subtraction and approaches based on vector quantization.
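The composite-dictionary idea in the abstract can be illustrated with a minimal sketch. The abstract does not name the sparse coding solver or the signal representation, so everything below is an assumption made for illustration: a plain orthogonal matching pursuit stands in for the solver, the frames are generic feature vectors (for example magnitude spectra), the atoms are assumed to have unit norm, and the names `omp`, `enhance_frame`, `D_speech`, `D_interf` and `n_nonzero` are hypothetical. Dictionary learning itself is not shown; both dictionaries are taken as given.

```python
import numpy as np


def omp(D, y, n_nonzero):
    """Greedy orthogonal matching pursuit (assumed stand-in solver).

    Approximates y with at most n_nonzero columns (atoms) of D.
    Atoms are assumed to have unit norm.
    """
    residual = y.astype(float).copy()
    support = []
    sol = np.zeros(0)
    coef = np.zeros(D.shape[1])
    for _ in range(n_nonzero):
        # Select the atom most correlated with the current residual.
        k = int(np.argmax(np.abs(D.T @ residual)))
        if k in support:
            break  # no new atom improves the fit
        support.append(k)
        # Re-fit all selected atoms jointly by least squares.
        sol, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ sol
    if support:
        coef[support] = sol
    return coef


def enhance_frame(y, D_speech, D_interf, n_nonzero):
    """Code a noisy frame in [D_speech, D_interf]; keep the speech part only."""
    D = np.hstack([D_speech, D_interf])  # composite dictionary
    coef = omp(D, y, n_nonzero)
    n_speech = D_speech.shape[1]
    # Coefficients on interferer atoms are discarded in the estimate.
    return D_speech @ coef[:n_speech]


# Toy example: orthonormal atoms, one speech and one interferer component.
I4 = np.eye(4)
D_speech_toy, D_interf_toy = I4[:, :2], I4[:, 2:]
noisy = np.array([2.0, 0.0, 0.0, 3.0])
speech_estimate = enhance_frame(noisy, D_speech_toy, D_interf_toy, n_nonzero=2)
```

In this sketch, `n_nonzero` is a crude stand-in for the trade-off the abstract mentions: a sparser code tends to distort the speech, while a denser code tends to let speech atoms explain interferer energy.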
Keywords :
Fourier transforms; speech coding; speech enhancement; vector quantisation; cepstral distance; environment specific interferer dictionary; learned dictionaries; monaural speech enhancement method; noisy speech signal coding; signal processing applications; sparse coding; speech dictionary; vector quantization; Cepstral analysis; Degradation; Dictionaries; Distortion measurement; Signal processing; Signal to noise ratio; Speech coding; Speech enhancement; Speech processing; Working environment noise; Dictionary Learning; Source Separation; Sparse Coding; Speech Enhancement;
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
Conference_Location :
Dallas, TX
Print_ISBN :
978-1-4244-4295-9
Electronic_ISBN :
1520-6149
DOI :
10.1109/ICASSP.2010.5495157