Title :
Towards Human-Like Production and Binaural Localization of Speech Sounds in Humanoid Robots
Author :
Wolff, Robert ; Lasseck, Mario ; Hild, Manfred ; Vilarroya, Oscar ; Hadzibeganovic, Tarik
Author_Institution :
Labor für Neurorobotik, Humboldt-Univ. zu Berlin, Berlin, Germany
Abstract :
We present a prototype of a humanoid robot head equipped with human-like speech sound localization and production systems, designed for a new generation of robots that should autonomously evolve language and other cognitive skills. Like the human auditory apparatus, the robot head contains a binaural sensor system, here based upon a frequency domain binaural model. This enables the robot to detect and locate a speaker autonomously on the basis of the speech signals the speaker produces. In humans, however, the temporal regularity of incoming sounds is analyzed on different time scales: periods in the millisecond range give rise to the sensation of pitch, and periods on the order of seconds give rise to the sensation of rhythm. In addition, detecting and localizing multiple sound signals, although routine for humans, is a nontrivial problem for machine audition. We therefore discuss a possible implementation of human-like spatiotemporal processing of sounds in single-source and multisource scenarios. Our future goals are to combine the constructed speech synthesis and physical audio systems, and to establish an algorithm for detailed spatiotemporal localization of both single and concurrent speech sound sources, with roughly human-like temporal and spatial processing capabilities.
Keywords :
humanoid robots; sensors; speech processing; speech synthesis; binaural sensor system; cognitive skills; concurrent speech sound source; constructed speech synthesis; frequency domain binaural model; human auditory apparatus; human-like spatiotemporal sound processing; human-like speech sound localization; humanoid robot head; language skills; physical audio systems; single speech sound source; Cognitive robotics; Humanoid robots; Humans; Magnetic heads; Natural languages; Production systems; Prototypes; Robot sensing systems; Spatiotemporal phenomena; Speech synthesis;
Conference_Title :
Bioinformatics and Biomedical Engineering, 2009. ICBBE 2009. 3rd International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4244-2901-1
Electronic_ISBN :
978-1-4244-2902-8
DOI :
10.1109/ICBBE.2009.5163693