DocumentCode
2948947
Title
Solving the indeterminations of blind source separation of convolutive speech mixtures
Author
Rivet, Bertrand ; Girin, Laurent ; Jutten, Christian
Author_Institution
Speech Commun. Inst., Grenoble Nat. Polytech. Inst., France
Volume
5
fYear
2005
fDate
18-23 March 2005
Abstract
Looking at the speaker's face seems useful for hearing a speech signal better and extracting it from competing sources before identification. We present a novel algorithm plugging the audiovisual coherence of speech signals, estimated by statistical tools, on audio blind source separation (BSS) algorithms in the difficult case of convolutive mixtures. The algorithm mainly works in the frequency (transform) domain, where the convolutive mixture becomes an additive mixture for each frequency channel. Frequency by frequency separation is made by an audio BSS algorithm, and the audiovisual information is used to solve the standard source permutation and scale factor problems at the output of the separation stage, for each frequency. The proposed method is shown to be efficient in the case of 2×2 convolutive mixtures.
Keywords
audio signal processing; audio-visual systems; blind source separation; frequency-domain analysis; parameter estimation; speech processing; statistical analysis; video signal processing; additive mixture; audio BSS algorithm; audiovisual coherence; blind source separation indeterminations; convolutive speech mixtures; frequency domain; transform domain; Blind source separation; Coherence; Filters; Frequency; Oral communication; Signal processing; Signal processing algorithms; Source separation; Speech enhancement; Speech processing;
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech, and Signal Processing, 2005. Proceedings. (ICASSP '05). IEEE International Conference on
ISSN
1520-6149
Print_ISBN
0-7803-8874-7
Type
conf
DOI
10.1109/ICASSP.2005.1416358
Filename
1416358
Link To Document