DocumentCode :
2554287
Title :
Multi-modal front-end for speaker activity detection in small meetings
Author :
Even, Jani ; Heracleous, Panikos ; Ishi, Carlos ; Hagita, Norihiro
Author_Institution :
ATR Intelligent Robotics and Communication Laboratories, Kyoto, Japan
fYear :
2011
fDate :
25-30 Sept. 2011
Firstpage :
536
Lastpage :
541
Abstract :
Small informal meetings of two to four participants are very common in work environments. For this reason, a convenient way of recording and archiving these meetings is of great interest. To archive such meetings efficiently, an important task is to keep track of “who talked when” during a meeting. This paper proposes a new multi-modal approach to tackle this speaker activity detection problem. One novelty of the proposed approach is that it uses a human tracker that relies on scanning laser range finders (LRFs) to localize the participants. This choice is especially relevant for robotic applications, as robots are often equipped with LRFs for navigation purposes. In the proposed system, a tabletop microphone array in the center of the meeting room acquires the audio data while the LRF-based human tracker monitors the movement of the participants. Speaker activity detection is then performed using Gaussian mixture models that were trained beforehand. An experiment reproducing a meeting configuration demonstrates the performance of the system for speaker activity detection. In particular, the proposed hands-free system maintains a good level of performance compared to the use of close-talking microphones while participants are speaking simultaneously.
Keywords :
Arrays; Humans; Interference; Joints; Microphones; Noise; Robots;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Intelligent Robots and Systems (IROS), 2011 IEEE/RSJ International Conference on
Conference_Location :
San Francisco, CA
ISSN :
2153-0858
Print_ISBN :
978-1-61284-454-1
Type :
conf
DOI :
10.1109/IROS.2011.6095051
Filename :
6095051