Regional Information Center for Science and Technology - Enhanced Robot Speech Recognition Based on Microphone Array Source Separation and Missing Feature Theory

DocumentCode :

2936317

Title :

Enhanced Robot Speech Recognition Based on Microphone Array Source Separation and Missing Feature Theory

Author :

Yamamoto, Shun'ichi ; Valin, Jean-Marc ; Nakadai, Kazuhiro ; Rouat, Jean ; Michaud, François ; Ogata, Tetsuya ; Okuno, Hiroshi G.

Author_Institution :

Graduate School of Informatics, Kyoto University, Kyoto 606-8501, Japan; shunichi@kuis.kyoto-u.ac.jp

fYear :

2005

fDate :

18-22 April 2005

Firstpage :

1477

Lastpage :

1482

Abstract :

A humanoid robot in a real-world environment usually hears mixtures of sounds, so three capabilities are essential for robot audition: sound source localization, sound source separation, and recognition of the separated sounds. While the first two are frequently addressed, the third has received far less attention. We present a system that gives a humanoid robot the ability to localize, separate, and recognize simultaneous sound sources. A microphone array is used together with a dedicated real-time implementation of Geometric Source Separation (GSS) and a multi-channel post-filter that further reduces interference from other sources. An automatic speech recognizer (ASR) based on Missing Feature Theory (MFT) recognizes the separated sounds in real time, using missing feature masks generated automatically from the post-filtering step. The main advantage of this approach for humanoid robots is that an ASR with a clean acoustic model can adapt to the distortion of the separated sound by consulting the post-filter feature masks. Recognition rates are presented for three simultaneous speakers located 2 m from the robot. Using both the post-filter and the missing feature mask yields an average relative reduction in error rate of 42%.
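
The abstract does not say how the missing feature masks are derived from the post-filter, so the following Python sketch is only an illustration of the general idea, not the authors' method. It assumes a mask can be obtained by thresholding the fraction of per-band energy that the post-filter retains, and that the recognizer scores a frame with a diagonal Gaussian while skipping features marked unreliable. The function names, the input arrays, and the threshold value of 0.25 are all hypothetical.

```python
import numpy as np


def missing_feature_mask(pre_energy, post_energy, threshold=0.25):
    """Return a binary reliability mask (1 = reliable, 0 = missing).

    pre_energy, post_energy: arrays of shape (frames, bands) holding the
    per-band energy before and after the post-filter (hypothetical inputs).
    threshold: hypothetical cut-off on the retained energy fraction.
    """
    pre = np.asarray(pre_energy, dtype=float)
    post = np.asarray(post_energy, dtype=float)
    # Fraction of energy the post-filter kept; guard against division by zero.
    ratio = post / np.maximum(pre, 1e-12)
    return (ratio >= threshold).astype(int)


def masked_log_likelihood(features, mask, mean, var):
    """Log-likelihood of one frame under a diagonal Gaussian.

    Only features with mask == 1 contribute; features marked missing are
    left out of the sum, which is the simplest way of ignoring them.
    """
    x = np.asarray(features, dtype=float)
    m = np.asarray(mask, dtype=float)
    mu = np.asarray(mean, dtype=float)
    v = np.asarray(var, dtype=float)
    per_dim = -0.5 * (np.log(2.0 * np.pi * v) + (x - mu) ** 2 / v)
    return float(np.sum(m * per_dim))


# Usage with made-up numbers: the middle band loses most of its energy in
# the post-filter, so it is treated as missing when scoring the frame.
pre = [[1.0, 2.0, 4.0]]
post = [[0.9, 0.2, 3.0]]
mask = missing_feature_mask(pre, post)
score = masked_log_likelihood([0.0, 5.0, 0.0], mask[0],
                              mean=[0.0, 0.0, 0.0], var=[1.0, 1.0, 1.0])
```

Because the heavily attenuated band is masked out, its large deviation from the clean model does not penalize the score. This is the sense in which a recognizer trained on clean speech can tolerate the distortion left by separation.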

Keywords :

Acoustic distortion; Automatic speech recognition; Error analysis; Humanoid robots; Interference; Loudspeakers; Microphone arrays; Source separation; Speech coding; Speech recognition;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Robotics and Automation, 2005. ICRA 2005. Proceedings of the 2005 IEEE International Conference on

Print_ISBN :

0-7803-8914-X

Type :

conf

DOI :

10.1109/ROBOT.2005.1570323

Filename :

1570323

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2936317