DocumentCode :
1721436
Title :
Computational Auditory Scene Analysis and its Application to Robot Audition
Author :
Okuno, Hiroshi G. ; Nakadai, Kazuhiro
Author_Institution :
Graduate School of Informatics, Kyoto University, Kyoto
fYear :
2008
Firstpage :
124
Lastpage :
127
Abstract :
Robot audition, a robot's ability to hear sounds, in particular a mixture of sounds, with its own microphones, is important for improving human-robot interaction. This paper presents the open-source robot audition software "HARK" (HRI-JP Audition for Robots with Kyoto University), which provides the primitive functions of computational auditory scene analysis: sound source localization, sound source separation, and recognition of separated sounds. Because separation introduces spectral distortion into the separated sounds, HARK generates a time-spectral reliability map, called a "missing feature mask", for the features of the separated sounds. The separated sounds are then recognized by automatic speech recognition (ASR) based on missing-feature theory (MFT), using these missing feature masks. HARK is implemented on the middleware "FlowDesigner" to share intermediate audio data, which enables near real-time processing.
Keywords :
hearing; microphone arrays; public domain software; robot programming; source separation; speech recognition; FlowDesigner middleware; HARK robot audition open-source software; MFT based ASR; automatic speech recognition; computational auditory scene analysis; human robot interaction; microphones; missing feature mask; missing-feature theory; separated sound recognition; sound source localization; sound source separation; Auditory system; Human robot interaction; Image analysis; Intelligent robots; Microphones; Multiple signal classification; Music; Open source software; Speech; Switches; Missing feature theory; computational auditory scene analysis; robot audition; simultaneous speakers;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Hands-Free Speech Communication and Microphone Arrays, 2008. HSCMA 2008
Conference_Location :
Trento
Print_ISBN :
978-1-4244-2337-8
Electronic_ISBN :
978-1-4244-2338-5
Type :
conf
DOI :
10.1109/HSCMA.2008.4538702
Filename :
4538702