Title :
Temporal smearing compensation in reverberant environment for speech-based human-robot interaction
Author :
Gomez, Randy ; Nakamura, Keisuke ; Mizumoto, Takeshi ; Nakadai, Kazuhiro
Author_Institution :
Honda Res. Inst. Japan Ltd. Co., Wako, Japan
Abstract :
Speech-based human-robot interaction is often plagued by issues such as reverberation and changes in speaker position that degrade overall performance. In this paper, we present a method for compensating for the joint effects of reverberation and changes in speaker position. The acoustic perturbation caused by these two factors takes its toll on Automatic Speech Recognition (ASR) and, in turn, on Spoken Language Understanding (SLU), which ultimately leads to a failure of the human-robot interaction experience. The proposed method is specifically designed to address the challenging environmental conditions in which robots are deployed. First, we analyze the impact of reverberation in the form of temporal smearing for each change in speaker position. Then, we extract the smearing coefficients that capture the joint dynamics between the speech signal at the current position and the room acoustics as observed by the robot. These coefficients are used to update the room transfer function (RTF), and the suppression parameters are stored offline. Moreover, all of these processes are optimized in the context of the ASR system for robot applications. In the online mode, reverberant data at an arbitrary position is processed using the parameters pre-computed offline. This effectively compensates for the joint effects of reverberation at the arbitrary speaker position. Experimental results using real data gathered in a human-robot communication setting show that the proposed method outperforms existing methods.
Keywords :
acoustic signal processing; human-robot interaction; perturbation techniques; reverberation; speech recognition; speech-based user interfaces; ASR system; RTF; SLU; acoustic perturbation; automatic speech recognition; joint dynamics; online mode; reverberant environment; room transfer function; smearing coefficients; speaker position; speech-based human-robot interaction; spoken language understanding; suppression parameters; temporal smearing compensation; Databases; Hidden Markov models; Joints; Reverberation; Robots; Speech; Automatic Speech Recognition; Dereverberation; Robustness; Speech Enhancement
Conference_Title :
Robotics and Automation (ICRA), 2015 IEEE International Conference on
Conference_Location :
Seattle, WA
DOI :
10.1109/ICRA.2015.7139661