Title :
Simple auditory and visual features for human-robot dialog scene analysis
Author :
Yan, Rujiao ; Rodemann, Tobias ; Wrede, Britta
Author_Institution :
Res. Inst. for Cognition & Robot. (CoRLab), Bielefeld Univ., Bielefeld, Germany
Abstract :
This paper presents a system that uses various simple auditory and visual features to achieve human-robot dialog scene analysis. Our scene analysis system is able to learn how many speakers are in the scenario, where the speakers are, and who is currently speaking. Speakers are unknown in advance. A visual short-term memory (STM) helps to memorize persons, even if they disappear from the camera's field of view for a while due to movements of the persons or the robot head. In comparison to our previous work, we apply more visual features, such as height, color, and texture features of different upper body parts, to improve the scene representation performance. We show that our system is able to assign words to the corresponding speakers. A speaker is recognized again when he leaves and re-enters the scene, or changes his position, even when a new person appears.
Keywords :
audio-visual systems; control engineering computing; hearing; human-robot interaction; interactive systems; motion control; robot vision; speech processing; STM; auditory feature; camera field-of-view; color feature; computational audiovisual scene analysis; height feature; human-robot dialog scene analysis system; person movement; robot head movement; scene representation performance; speaker; texture feature; visual feature; visual short-term-memory; Cameras; Face; Hair; Image analysis; Robots; Vectors; Visualization;
Conference_Title :
Intelligent Robots and Systems (IROS), 2012 IEEE/RSJ International Conference on
Conference_Location :
Vilamoura
Print_ISBN :
978-1-4673-1737-5
DOI :
10.1109/IROS.2012.6385534