DocumentCode :
1699713
Title :
Speaker localization in conferencing systems employing phase features and wavelet transform
Author :
Samborski, Rafal ; Ziolko, Mariusz
Author_Institution :
Dept. of Electron., AGH Univ. of Sci. & Technol., Kraków, Poland
fYear :
2013
Abstract :
Some existing conference systems employ a distant microphone array instead of a dedicated microphone for each user. This approach is much more convenient, although it suffers from much higher noise sensitivity. One possible solution is to employ beamforming techniques to focus on the user who is currently speaking. However, a beamformer needs information about the direction of arrival (DOA), which is usually obtained by analysing the phase differences between signals. The effectiveness of such a solution decreases dramatically when the environment becomes noisy. In this paper, a novel, robust meeting diarization system is described. The decision about which user is currently speaking is based not only on spatial features of the signal (i.e., the speaker's location) but also on spectral features. The microphone array estimates the speaker's location using generalized cross-correlation with phase transform (GCC-PHAT). Additionally, a speaker recognition system employing the wavelet-Fourier transform (WFT) extracts spectral features of the voice. The described solution is much more robust than one based on speaker recognition or speaker localization alone. Experiments conducted during meetings in a regular meeting room show that it is less sensitive to noise and that switching between speakers is several times faster.
Keywords :
Fourier transforms; array signal processing; correlation methods; direction-of-arrival estimation; feature extraction; microphone arrays; speaker recognition; spectral analysis; wavelet transforms; DOA parameter; GCC-PHAT; WFT; beamforming techniques; conferencing systems; direction of arrival parameter; distant microphone array; generalized cross-correlation; phase features; phase transform; robust meetings diarization system; signal spatial features; speaker localization; speaker recognition system; spectral feature extraction; wavelet transform; wavelet-Fourier transform; Microphones; Robustness; Three-dimensional displays; Wideband; microphone arrays; speaker localization; speaker recognition; wavelets;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Signal Processing and Information Technology (ISSPIT), 2013 IEEE International Symposium on
Conference_Location :
Athens
Type :
conf
DOI :
10.1109/ISSPIT.2013.6781903
Filename :
6781903