Title :
Dereverberation robust to speaker's azimuthal orientation in multi-channel human-robot communication
Author :
Gomez, Raquel ; Nakamura, Kentaro ; Nakadai, Kazuhiro
Author_Institution :
Honda Res. Inst. Japan Ltd. Co., Wako, Japan
Abstract :
The acoustical dynamics of reverberation in an enclosed environment pose a problem for human-robot communication. Any change in the azimuthal orientation of the speaker contributes to unpredictable acoustical activity, which degrades the performance of the automatic speech recognition (ASR) system. Dereverberation techniques therefore need to address this issue prior to ASR. Dereverberation in multi-channel applications primarily revolves around the adoption of a suitable reverberant model that results in a computationally feasible solution and at the same time yields an accurate estimate of the harmful reflections (i.e., late reflection) for effective suppression. In this paper we address this problem by introducing a hybrid method based on multi-channel processing on a single-channel reverberant model platform. The proposed method is capable of accurate signal estimation, a property inherent to a multi-channel system, and at the same time retains the computational efficiency of the single-channel reverberant model approach. The proposed method is summarized as follows. First, multi-channel sound-source processing is employed to obtain the full reverberant and the late reflection signal estimates. Then, equalization is employed to update the late reflection estimate so that it reflects the change in azimuth prior to dereverberation. The equalization parameters for azimuthal change are obtained through an offline optimization procedure. Experimental evaluation in an actual human-robot communication environment shows that the proposed method outperforms existing methods in terms of robustness in the ASR performance.
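The abstract does not give the equations for the equalization or suppression steps, so the following is only an illustrative sketch of the general idea: scale a late-reflection power estimate per frequency band, then remove it from the full reverberant power. It assumes plain power spectral subtraction with a spectral floor, which may differ from the paper's actual suppression rule. The function names, the `band_gains` values, and the toy data are all hypothetical; in the paper the equalization parameters come from an offline optimization that is not reproduced here.

```python
import numpy as np

def equalize_late_reflection(late_power, band_gains):
    """Scale each frequency band of the late-reflection power estimate.

    late_power: array of shape (bands, frames), non-negative.
    band_gains: array of shape (bands,), hypothetical per-azimuth gains
                assumed to come from an offline optimization.
    """
    return late_power * band_gains[:, np.newaxis]

def spectral_subtraction(reverberant_power, late_power, floor=0.05):
    """Subtract late-reflection power, keeping a floor relative to the input.

    This is generic power spectral subtraction, used here as an assumption.
    """
    residual = reverberant_power - late_power
    return np.maximum(residual, floor * reverberant_power)

# Toy data (made up for illustration): 2 frequency bands, 3 frames.
reverberant_power = np.array([[4.0, 4.0, 4.0],
                              [2.0, 2.0, 2.0]])
late_power = np.array([[1.0, 2.0, 8.0],
                       [1.0, 1.0, 1.0]])
band_gains = np.array([0.5, 1.0])  # hypothetical gains for one azimuth

equalized = equalize_late_reflection(late_power, band_gains)
dereverberated = spectral_subtraction(reverberant_power, equalized)
```

In this toy case the third frame of the first band is over-subtracted, so the floor keeps the output at 5% of the reverberant power instead of letting it reach zero.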
Keywords :
acoustic signal processing; human-robot interaction; reverberation; speaker recognition; speech processing; ASR; automatic speech recognition system; azimuthal change; computational efficiency; dereverberation; multichannel human-robot communication; multichannel sound-source processing; offline optimization procedure; reverberation acoustical dynamics; signal estimation; single-channel reverberant model platform; speaker azimuthal orientation; speech-based human-robot interaction; unpredictable acoustical activity; Azimuth; Computational modeling; Microphones; Reverberation; Robots; Speech; Training;
Conference_Title :
Intelligent Robots and Systems (IROS), 2013 IEEE/RSJ International Conference on
Conference_Location :
Tokyo
DOI :
10.1109/IROS.2013.6696846