DocumentCode
383140
Title
Auditory fovea based speech separation and its application to dialog system
Author
Nakadai, Kazuhiro ; Okuno, Hiroshi G. ; Kitano, Hiroaki
Author_Institution
Kitano Symbiotic Syst. Project, Japan Sci. & Technol. Corp, Shibuya-ku, Japan
Volume
2
fYear
2002
fDate
2002
Firstpage
1320
Abstract
This paper presents an active direction-pass filter (ADPF) that separates sounds originating from a specified direction using a pair of microphones, and reports its application to front-end processing for speech recognition. Because the sound source separation performance of the ADPF depends on the accuracy of sound source localization (direction), various localization modules, including the interaural phase difference and interaural intensity difference for each sub-band as well as other visual and auditory processing, are integrated hierarchically. The resulting accuracy of auditory localization varies with the position of the sound source relative to the robot: resolution is much higher at the center than at the periphery, a property similar to that of the visual fovea. To make the best use of this property, the ADPF controls the direction of the robot's head by motor movement. To recognize the sound streams separated by the ADPF, a hidden Markov model based automatic speech recognition system is built with multiple acoustic models trained on the output of the ADPF under different conditions. A preliminary dialog system is thus implemented on an upper-torso humanoid. Experimental results show that it works well even when two speakers speak simultaneously.
Keywords
active filters; computer vision; filtering theory; interactive systems; real-time systems; robots; speech recognition; target tracking; active direction-pass filter; auditory fovea; hidden Markov model; human tracking system; humanoid robot; interaural intensity difference; interaural phase difference; mobile robots; real-time system; sound source localization; speech recognition; stereo vision; Active filters; Automatic speech recognition; Hidden Markov models; Humans; Microphones; Mobile robots; Robot sensing systems; Robotics and automation; Source separation; Speech recognition;
fLanguage
English
Publisher
IEEE
Conference_Titel
Intelligent Robots and Systems, 2002. IEEE/RSJ International Conference on
Print_ISBN
0-7803-7398-7
Type
conf
DOI
10.1109/IRDS.2002.1043937
Filename
1043937