Regional Information Center for Science and Technology - Real-Time Robot Audition System That Recognizes Simultaneous Speech in The Real World

DocumentCode :

3190014

Title :

Real-Time Robot Audition System That Recognizes Simultaneous Speech in The Real World

Author :

Yamamoto, Shunichi ; Nakadai, Kazuhiro ; Nakano, Mikio ; Tsujino, Hiroshi ; Valin, Jean-Marc ; Komatani, Kazunori ; Ogata, Tetsuya ; Okuno, Hiroshi G.

Author_Institution :

Graduate Sch. of Informatics, Kyoto Univ.

fYear :

2006

fDate :

9-15 Oct. 2006

Firstpage :

5333

Lastpage :

5338

Abstract :

This paper presents a robot audition system that recognizes simultaneous speech in the real world by using robot-embedded microphones. We have previously reported a missing feature theory (MFT) based integration of sound source separation (SSS) and automatic speech recognition (ASR) for building robust robot audition. We demonstrated that an MFT-based prototype system drastically improved the performance of speech recognition even when three speakers talked to a robot simultaneously. However, the prototype system had three problems: it ran only offline, its system parameters were hand-tuned, and its voice activity detection (VAD) often failed. To attain online processing, we introduced a FlowDesigner-based architecture to integrate sound source localization (SSL), SSS and ASR. This architecture brings fast processing and easy implementation because it provides a simple framework of shared-object-based integration. To optimize the parameters, we developed genetic algorithm (GA) based parameter optimization, because it is difficult to build an analytical optimization model for mutually dependent system parameters. To improve VAD, we integrated a new VAD based on the power spectrum and location of a sound source into the system, since conventional VAD relying only on power often fails due to the low signal-to-noise ratio of simultaneous speech. We then constructed a robot audition system for Honda ASIMO. Experiments on recognition of simultaneous speech in a noisy and echoic environment showed that the system worked online and fast, and achieved better robustness and accuracy.
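
The GA-based parameter optimization mentioned in the abstract can be illustrated with a minimal sketch. The record does not give the paper's actual GA operators, parameter set, or fitness function, so everything below is an assumption: `toy_score` is a hypothetical stand-in for recognition accuracy with two coupled parameters, and `genetic_search` uses a generic elitist GA (uniform crossover plus Gaussian mutation), not the authors' method.

```python
import random

def toy_score(params):
    # Hypothetical fitness (assumption): stands in for recognition accuracy.
    # It peaks at (0.3, 0.7); the cross term couples the two parameters,
    # mimicking mutually dependent system parameters.
    a, b = params
    return -((a - 0.3) ** 2 + (b - 0.7) ** 2 + 0.5 * (a - 0.3) * (b - 0.7))

def genetic_search(score, n_params=2, pop_size=30, generations=60, seed=0):
    # Generic elitist GA over parameters normalized to [0, 1] (assumption).
    rng = random.Random(seed)
    pop = [[rng.random() for _ in range(n_params)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        elite = pop[: pop_size // 3]  # best third survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            p1, p2 = rng.sample(elite, 2)
            # Uniform crossover: each gene is taken from one parent at random.
            child = [rng.choice(pair) for pair in zip(p1, p2)]
            # Gaussian mutation, clipped back into the [0, 1] range.
            child = [min(1.0, max(0.0, g + rng.gauss(0, 0.05))) for g in child]
            children.append(child)
        pop = elite + children
    return max(pop, key=score)

best = genetic_search(toy_score)
print(best, toy_score(best))
```

In a real system, evaluating the fitness would mean running the whole localization, separation, and recognition pipeline on recorded data, which is why a search method that needs only fitness values (and no analytical model) is attractive here.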

Keywords :

control engineering computing; robots; speech recognition; Honda ASIMO; missing feature theory; real-time robot audition system; shared-object-based integration; simultaneous speech recognition; sound source localization; sound source separation; voice activity detection; Automatic speech recognition; Loudspeakers; Microphones; Prototypes; Real time systems; Robotics and automation; Robots; Robustness; Source separation; Speech recognition; genetic algorithm; parameter optimization; real-time processing; robot audition

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Intelligent Robots and Systems, 2006 IEEE/RSJ International Conference on

Conference_Location :

Beijing

Print_ISBN :

1-4244-0258-1

Electronic_ISBN :

1-4244-0259-X

Type :

conf

DOI :

10.1109/IROS.2006.282037

Filename :

4059274

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3190014