DocumentCode :
2684660
Title :
Blind Speech Separation in a Meeting Situation with Maximum SNR Beamformers
Author :
Araki, Shoko ; Sawada, Hiroshi ; Makino, Shoji
Author_Institution :
NTT Commun. Sci. Lab., NTT Corp., Tokyo
Volume :
1
fYear :
2007
fDate :
15-20 April 2007
Abstract :
We propose a speech separation method for a meeting situation, where each speaker speaks only intermittently and the number of active speakers changes from moment to moment. Many source separation methods have already been proposed; however, they assume that all speakers keep speaking, which is not always true in a real meeting. In such cases, in addition to separation, speech detection and the classification of the detected speech according to speaker become important issues. For that purpose, we propose a method that employs a maximum signal-to-noise ratio (MaxSNR) beamformer combined with a voice activity detector and online clustering. We also discuss the scaling ambiguity problem of the MaxSNR beamformer and provide solutions to it. We report some encouraging results for a real meeting in a room with a reverberation time of about 350 ms.
Keywords :
blind source separation; speaker recognition; speech processing; blind speech separation; maximum SNR beamformers; maximum signal-to-noise; meeting situation; online clustering; source separation methods; speech classification; speech detection; voice activity detector; Fourier transforms; Frequency conversion; Frequency response; Information science; Interference; Proposals; Reverberation; Speech; Time frequency analysis; Wideband; Speech separation; maximum SNR beamformer; online clustering; scaling ambiguity; voice activity detector;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech and Signal Processing, 2007. ICASSP 2007. IEEE International Conference on
Conference_Location :
Honolulu, HI
ISSN :
1520-6149
Print_ISBN :
1-4244-0727-3
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2007.366611
Filename :
4217011