DocumentCode :
3275639
Title :
Boosting-Based Multimodal Speaker Detection for Distributed Meetings
Author :
Zhang, Cha ; Yin, Pei ; Rui, Yong ; Cutler, Ross ; Viola, Paul
Author_Institution :
Microsoft Res., Redmond, WA
fYear :
2006
fDate :
3-6 Oct. 2006
Firstpage :
86
Lastpage :
91
Abstract :
Speaker detection is a very important task in distributed meeting applications. This paper discusses a number of challenges we met while designing a speaker detector for the Microsoft RoundTable distributed meeting device, and proposes a boosting-based multimodal speaker detection (BMSD) algorithm. Instead of performing sound source localization (SSL) and multi-person detection (MPD) separately and subsequently fusing their individual results, the proposed algorithm uses boosting to select features from a combined pool of both audio and visual features simultaneously. The result is a very accurate speaker detector with extremely high efficiency. The algorithm reduces the error rate of the SSL-only approach by 47%, and that of the SSL and MPD fusion approach by 27%.
Keywords :
feature extraction; speaker recognition; BMSD algorithm; Microsoft RoundTable distributed meeting; audio features; boosting-based multimodal speaker detection; visual features; Acoustic signal detection; Boosting; Cameras; Detectors; Digital signal processing chips; Face detection; Head; Image resolution; Loudspeakers; Microphone arrays
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Multimedia Signal Processing, 2006 IEEE 8th Workshop on
Conference_Location :
Victoria, BC
Print_ISBN :
0-7803-9751-7
Electronic_ISBN :
0-7803-9752-5
Type :
conf
DOI :
10.1109/MMSP.2006.285274
Filename :
4064524