• DocumentCode
    137737
  • Title
    Speech-based human-robot interaction robust to acoustic reflections in real environment
  • Author
    Gomez, Raquel ; Inoue, Ken ; Nakamura, Kentaro ; Mizumoto, Tetsuya ; Nakadai, Kazuhiro
  • Author_Institution
    Honda Res. Inst. Japan Ltd. Co., Wako, Japan
  • fYear
    2014
  • fDate
    14-18 Sept. 2014
  • Firstpage
    1367
  • Lastpage
    1373
  • Abstract
    Acoustic reflection inside an enclosed environment is detrimental to human-robot interaction. Reflection may manifest as phantom sources emanating from unknown directions. In effect, a single speaker may falsely appear as multiple speakers to the robot audition system, impeding the robot's ability to correctly associate the speech command with the actual speaker. Moreover, speech reflection smears the original speech signal through reverberation, which degrades speech recognition and understanding performance. Conventional robot audition schemes that rely purely on acoustic and spatial information are very sensitive to acoustic reflection, which ultimately leads to failure in human-robot interaction. We propose a method for human-robot interaction that is robust to the effect of acoustic reflection. First, visual information is utilized and a head tracking scheme is employed to reinforce the acoustic information with the visual presence of a prospective user. Second, we employ a model-based sound event identification scheme and scrutinize whether the acoustic information is likely to be speech or non-speech. Using all the information gathered, we create a simple rule construct to effectively discriminate the original source (actual speaker) from phantom sources (reflection). Consequently, the source identified as phantom (reflection) is used to estimate the unwanted smearing for effective suppression via speech enhancement. Experiments are conducted in a human-robot interaction setting, in which the proposed method outperforms the conventional method.
  • Keywords
    human-robot interaction; speech processing; acoustic information; acoustic reflection; head tracking scheme; model-based sound event identification scheme; phantom sources; robot audition system; spatial information; speech command; speech enhancement; speech recognition; speech reflection; speech understanding; speech-based human-robot interaction; visual information; Acoustics; Microphones; Phantoms; Robot sensing systems; Speech; Speech recognition;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Intelligent Robots and Systems (IROS 2014), 2014 IEEE/RSJ International Conference on
  • Conference_Location
    Chicago, IL
  • Type
    conf
  • DOI
    10.1109/IROS.2014.6942735
  • Filename
    6942735
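
The abstract describes a "simple rule construct" that combines head tracking with speech/non-speech identification to separate the actual speaker from phantom sources. The abstract does not give the rules themselves, so the following is only a minimal sketch under stated assumptions: the names `SourceCandidate`, `angle_diff`, `label_sources`, and the 10-degree tolerance are all hypothetical, and the rule (speech plus a tracked head in the same direction means speaker; speech without one means phantom) is one plausible reading, not the authors' method.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SourceCandidate:
    """One localized sound source (hypothetical structure)."""
    azimuth_deg: float  # direction of arrival from sound source localization
    is_speech: bool     # decision of a speech/non-speech sound event classifier


def angle_diff(a_deg: float, b_deg: float) -> float:
    """Signed smallest difference between two angles, in [-180, 180)."""
    return ((a_deg - b_deg + 180.0) % 360.0) - 180.0


def label_sources(
    candidates: List[SourceCandidate],
    head_azimuths_deg: List[float],
    tolerance_deg: float = 10.0,  # assumed matching tolerance
) -> List[str]:
    """Label each candidate as 'speaker', 'phantom', or 'non-speech'."""
    labels = []
    for cand in candidates:
        # Visual support: is any tracked head close to this direction?
        head_nearby = any(
            abs(angle_diff(cand.azimuth_deg, head)) <= tolerance_deg
            for head in head_azimuths_deg
        )
        if not cand.is_speech:
            labels.append("non-speech")
        elif head_nearby:
            labels.append("speaker")
        else:
            # Speech-like sound with no visible person: treat as reflection.
            labels.append("phantom")
    return labels


candidates = [
    SourceCandidate(azimuth_deg=30.0, is_speech=True),
    SourceCandidate(azimuth_deg=-120.0, is_speech=True),
    SourceCandidate(azimuth_deg=90.0, is_speech=False),
]
labels = label_sources(candidates, head_azimuths_deg=[32.0])
print(labels)
```

In a pipeline of the kind the abstract outlines, the candidates labeled as phantom would then feed the estimate of reverberant smearing used by the speech enhancement stage; that stage is not sketched here.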