Design and implementation of a robot audition system for automatic speech recognition of simultaneous speech

Author

Shun'ichi Yamamoto; Kazuhiro Nakadai; Mikio Nakano; Hiroshi Tsujino; Jean-Marc Valin; Kazunori Komatani; Tetsuya Ogata; Hiroshi G. Okuno

Author_Institution

Graduate School of Informatics, Kyoto University, Yoshida-Honmachi, Sakyo-ku, 606-8501, Japan

fYear

2007

Firstpage

111

Lastpage

116

Abstract

This paper addresses robot audition that can cope, in real time, with speech that has a low signal-to-noise ratio (SNR) by using robot-embedded microphones. To cope with such noise, we exploited two key ideas: preprocessing, which consists of sound source localization and separation with a microphone array, and system integration based on missing feature theory (MFT). Preprocessing improves the SNR of a target sound signal using geometric source separation with a multichannel post-filter. MFT uses only reliable acoustic features in speech recognition and masks unreliable parts caused by errors in preprocessing. MFT thus provides smooth integration between preprocessing and automatic speech recognition (ASR). A real-time robot audition system based on these two key ideas was constructed for Honda ASIMO and the humanoid SIG2 with 8-ch microphone arrays. The paper also reports the improvement in ASR performance for two and three simultaneous speech signals.
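
The masking idea in the abstract can be illustrated with a minimal sketch. The abstract does not say which MFT variant the system uses, so the code below assumes a hard (binary) mask and a single diagonal-covariance Gaussian as the acoustic score; the function name `masked_log_likelihood` and all values are hypothetical and are not taken from the paper. Unreliable feature dimensions are marginalized out, so they contribute nothing to the score.

```python
import math


def masked_log_likelihood(features, mask, means, variances):
    """Diagonal-Gaussian log-likelihood over reliable dimensions only.

    Hypothetical illustration: `mask[i]` is True when feature i is
    considered reliable. Unreliable dimensions are skipped, which is
    equivalent to integrating the Gaussian over that dimension.
    """
    total = 0.0
    for x, reliable, mu, var in zip(features, mask, means, variances):
        if not reliable:
            continue  # marginalized dimension adds log(1) = 0
        total += -0.5 * (math.log(2.0 * math.pi * var) + (x - mu) ** 2 / var)
    return total


# The second feature is assumed to be corrupted by a separation error.
features = [0.0, 5.0]
means = [0.0, 0.0]
variances = [1.0, 1.0]

full = masked_log_likelihood(features, [True, True], means, variances)
masked = masked_log_likelihood(features, [True, False], means, variances)
```

In this toy case `masked` is larger than `full`, because the corrupted dimension no longer penalizes the model that matches the reliable feature.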

Keywords

"Robotics and automation","Automatic speech recognition","Acoustic noise","Intelligent robots","Microphone arrays","Working environment noise","Signal to noise ratio","Real time systems","Speech recognition","Source separation"

Publisher

IEEE

Conference_Titel

Automatic Speech Recognition & Understanding, 2007. ASRU. IEEE Workshop on

Print_ISBN

978-1-4244-1745-2

Type

conf

DOI

10.1109/ASRU.2007.4430093

Filename

4430093