Title :
Speech Recognition for a Humanoid with Motor Noise Utilizing Missing Feature Theory
Author :
Nishimura, Yoshitaka ; Ishizuka, Mitsuru ; Nakadai, Kazuhiro ; Nakano, Mikio ; Tsujino, Hiroshi
Author_Institution :
Graduate Sch. of Inf. Sci. & Technol., Tokyo Univ.
Abstract :
Automatic speech recognition (ASR) is essential for human-humanoid communication. One of the main problems with ASR is that a humanoid inevitably generates motor noise. This noise is easily captured by the humanoid's microphones because the noise sources are closer to the microphones than the target speech source. Thus, the signal-to-noise ratio (SNR) of the input speech becomes quite low (sometimes less than 0 dB). However, it is possible to estimate the noise by using information about the humanoid's own motions and gestures. In this paper we propose a method that uses this motion/gesture information to improve ASR for a humanoid with motor noise. The method consists of psychologically-inspired noise suppression and missing-feature-theory-based ASR (MFT-ASR). The proposed noise suppression technique adds white noise after noise suppression; this does not improve the SNR, but it makes the signal suitable for MFT-ASR. This is inspired by the fact that noise addition sometimes helps human perception, as described in Gestalt psychology. MFT-ASR improves ASR by masking unreliable acoustic features in the input sound. The motion/gesture information is used to estimate the reliability of acoustic features in MFT-ASR. We evaluated the proposed method with noisy speech recorded by Honda ASIMO in a room with reverberation. The noise data contained 32 kinds of noise, including motor noise without motions, gesture noise, and walking noise. The experimental results show that the proposed method outperforms the conventional multi-condition training technique.
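The two quantities the abstract relies on, the SNR and the masking of unreliable acoustic features, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the hard 0/1 mask, and the diagonal-Gaussian acoustic model are all assumptions made for illustration, and the abstract does not say how the mask is actually derived from motion/gesture information.

```python
import math


def snr_db(speech, noise):
    """Signal-to-noise ratio in dB from two sample sequences (hypothetical helper)."""
    p_speech = sum(s * s for s in speech) / len(speech)
    p_noise = sum(n * n for n in noise) / len(noise)
    return 10.0 * math.log10(p_speech / p_noise)


def masked_log_likelihood(features, mask, means, variances):
    """Diagonal-Gaussian log-likelihood over reliable features only.

    Assumption: a hard mask, where mask[i] == 1 marks feature i as reliable.
    Features with mask[i] == 0 are marginalized out, which for a diagonal
    Gaussian amounts to skipping those dimensions.
    """
    total = 0.0
    for x, m, mu, var in zip(features, mask, means, variances):
        if m:
            total += -0.5 * (math.log(2.0 * math.pi * var) + (x - mu) ** 2 / var)
    return total


# Noise with twice the amplitude of the speech gives an SNR below 0 dB.
low_snr = snr_db([1.0, 1.0], [2.0, 2.0])

# The second feature is far from the model mean (as if corrupted by noise).
features, means, variances = [0.0, 100.0], [0.0, 0.0], [1.0, 1.0]
score_all = masked_log_likelihood(features, [1, 1], means, variances)
score_masked = masked_log_likelihood(features, [1, 0], means, variances)
```

In this sketch, masking the corrupted feature keeps it from dominating the score, so score_masked is much higher than score_all. A recognizer following the approach in the abstract would estimate the mask from the robot's motion/gesture information rather than fixing it by hand.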
Keywords :
feature extraction; humanoid robots; man-machine systems; signal denoising; speech recognition; white noise; Gestalt psychology; Honda ASIMO; acoustic features; automatic speech recognition; gesture noise; human-humanoid communication; humanoid microphones; humanoid robot; missing feature theory; motor noise; psychologically-inspired noise suppression; signal-to-noise ratio; target speech source; walking noise; white noise; Acoustic noise; Automatic speech recognition; Microphones; Motion estimation; Noise generators; Psychology; Signal to noise ratio; Speech enhancement; Speech recognition; White noise;
Conference_Title :
Humanoid Robots, 2006 6th IEEE-RAS International Conference on
Conference_Location :
Genova
Print_ISBN :
1-4244-0200-X
Electronic_ISBN :
1-4244-0200-X
DOI :
10.1109/ICHR.2006.321359