DocumentCode :
1603866
Title :
Audiovisual speaker localization in medium smart meeting room
Author :
Ronzhin, Andrey ; Ronzhin, Alexander ; Budkov, Viktor
Author_Institution :
Speech & Multimodal Interfaces Lab., St. Petersburg Inst. for Inf. & Autom., St. Petersburg, Russia
fYear :
2011
Firstpage :
1
Lastpage :
5
Abstract :
The automatic selection of the currently active speaker among more than thirty participants in a medium-sized meeting room is considered. Video tracking and sound source localization techniques are implemented to record AVI files of speaker remarks in the developed smart meeting room. Video streams from five cameras are processed to register participants in fixed chair positions and to track the main speaker, based on histogram comparison and an AdaBoosted cascade classifier for face detection. Multichannel sound source localization based on the GCC-PHAT method is used to estimate the speaker position with four microphone arrays. At an SNR of 18 dB, the sound source localization rate was about 97% and the fine RMSE was lower than 0.23 m.
Keywords :
audio-visual systems; face recognition; microphone arrays; AdaBoosted cascade classifier; active speaker; audiovisual speaker localization; automatic selection; face detection; medium smart meeting room; medium-sized meeting room; microphone arrays; multichannel sound source localization; video processing; video tracking; Arrays; Cameras; Estimation; Microphone arrays; Signal to noise ratio; Speech; microphone array; smart meeting room; sound source localization; speaker detection;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Information, Communications and Signal Processing (ICICS) 2011 8th International Conference on
Conference_Location :
Singapore
Print_ISBN :
978-1-4577-0029-3
Type :
conf
DOI :
10.1109/ICICS.2011.6173618
Filename :
6173618