Regional Information Center for Science and Technology - Boosting-Based Multimodal Speaker Detection for Distributed Meetings

DocumentCode :

3275639

Title :

Boosting-Based Multimodal Speaker Detection for Distributed Meetings

Author :

Zhang, Cha ; Yin, Pei ; Rui, Yong ; Cutler, Ross ; Viola, Paul

Author_Institution :

Microsoft Res., Redmond, WA

fYear :

2006

fDate :

3-6 Oct. 2006

Firstpage :

Lastpage :

Abstract :

Speaker detection is a very important task in distributed meeting applications. This paper discusses a number of challenges we met while designing a speaker detector for the Microsoft RoundTable distributed meeting device, and proposes a boosting-based multimodal speaker detection (BMSD) algorithm. Instead of performing sound source localization (SSL) and multi-person detection (MPD) separately and subsequently fusing their individual results, the proposed algorithm uses boosting to select features from a combined pool of both audio and visual features simultaneously. The result is a very accurate speaker detector with extremely high efficiency. The algorithm reduces the error rate of the SSL-only approach by 47%, and that of the SSL and MPD fusion approach by 27%.
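
The abstract's central idea, letting boosting pick weak classifiers from one pooled set of audio and visual features, can be illustrated with a minimal sketch. This is generic discrete AdaBoost over one-feature threshold stumps, not the authors' implementation: the abstract does not specify the boosting variant or the features used, so the feature names (`audio_ssl_score`, `visual_motion_score`), the toy data, and all function names here are assumptions made for illustration.

```python
import math

# Hypothetical pooled feature vector: one audio cue and one visual cue per sample.
FEATURE_NAMES = ["audio_ssl_score", "visual_motion_score"]

def stump(row, j, thr, pol):
    """Weak classifier: votes pol when feature j reaches thr, otherwise -pol."""
    return pol if row[j] >= thr else -pol

def train_adaboost(X, y, rounds):
    """Discrete AdaBoost; returns a list of (feature, threshold, polarity, alpha)."""
    n, d = len(X), len(X[0])
    w = [1.0 / n] * n  # start with uniform sample weights
    model = []
    for _ in range(rounds):
        best = None
        # Search the whole pooled feature set, audio and visual alike.
        for j in range(d):
            for thr in sorted({row[j] for row in X}):
                for pol in (1, -1):
                    err = sum(wi for wi, row, yi in zip(w, X, y)
                              if stump(row, j, thr, pol) != yi)
                    if best is None or err < best[0]:
                        best = (err, j, thr, pol)
        err, j, thr, pol = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the logarithm finite
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((j, thr, pol, alpha))
        # Raise the weight of misclassified samples, lower the rest, renormalize.
        w = [wi * math.exp(-alpha * yi * stump(row, j, thr, pol))
             for wi, row, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def predict(model, row):
    """Sign of the alpha-weighted vote of all selected stumps."""
    score = sum(alpha * stump(row, j, thr, pol) for j, thr, pol, alpha in model)
    return 1 if score >= 0 else -1

# Toy data (made up): a region counts as the speaker (+1) only when both cues are high.
X = [[0.1, 0.1], [0.1, 0.9], [0.9, 0.1], [0.9, 0.9]]
y = [-1, -1, -1, 1]

model = train_adaboost(X, y, rounds=3)
selected = [FEATURE_NAMES[j] for j, _, _, _ in model]
print(selected)
print([predict(model, row) for row in X])
```

On this toy set neither cue separates the classes alone, so the selected stumps end up drawing on both modalities. That mirrors, in miniature, the contrast the abstract draws with running SSL and MPD separately and fusing their outputs afterwards.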

Keywords :

feature extraction; speaker recognition; BMSD algorithm; Microsoft RoundTable distributed meeting; audio features; boosting-based multimodal speaker detection; visual features; Acoustic signal detection; Boosting; Cameras; Detectors; Digital signal processing chips; Face detection; Head; Image resolution; Loudspeakers; Microphone arrays

fLanguage :

English

Publisher :

IEEE

Conference_Titel :

Multimedia Signal Processing, 2006 IEEE 8th Workshop on

Conference_Location :

Victoria, BC

Print_ISBN :

0-7803-9751-7

Electronic_ISBN :

0-7803-9752-5

Type :

conf

DOI :

10.1109/MMSP.2006.285274

Filename :

4064524

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3275639