Title :
Online multimodal speaker detection for humanoid robots
Author :
Sanchez-Riera, Jordi ; Alameda-Pineda, Xavier ; Wienke, Johannes ; Deleforge, Antoine ; Arias, Sandra ; Cech, Jan ; Wrede, Sebastian ; Horaud, Radu
Author_Institution :
INRIA Grenoble Rhone-Alpes, Montbonnot, France
Date :
Nov. 29 2012-Dec. 1 2012
Abstract :
In this paper we address the problem of audio-visual speaker detection. We introduce an online system running on the humanoid robot NAO. The scene is perceived with two cameras and two microphones. A multimodal Gaussian mixture model (mGMM) fuses the information extracted from the auditory and visual sensors and detects the most probable audio-visual object, e.g., a person emitting a sound, in 3D space. The system is implemented on top of a platform-independent middleware and is able to process the information online (17 Hz). A detailed description of the system and its implementation is provided, with special emphasis on the online processing issues and the proposed solutions. Experimental validation, performed on five different scenarios, shows that the proposed method opens the door to robust human-robot interaction scenarios.
Keywords :
Gaussian processes; audio-visual systems; cameras; human-robot interaction; humanoid robots; microphones; middleware; sensor fusion; sensors; speaker recognition; NAO; audio-visual object detection; audio-visual speaker detection; auditory sensors; extracted information fusion; frequency 17 Hz; mGMM; multimodal Gaussian mixture model; online information process; online multimodal speaker detection; platform-independent middleware; robust human-robot interaction scenarios; visual sensors; Face; Robots; Synchronization; Three-dimensional displays; Visualization
Conference_Title :
Humanoid Robots (Humanoids), 2012 12th IEEE-RAS International Conference on
Conference_Location :
Osaka
DOI :
10.1109/HUMANOIDS.2012.6651509