• DocumentCode
    294687
  • Title

    Knowing who to listen to in speech recognition: visually guided beamforming

  • Author

    Bub, Udo ; Hunke, Martin ; Waibel, Alex

  • Author_Institution
    Carnegie Mellon Univ., Pittsburgh, PA, USA
  • Volume
    1
  • fYear
    1995
  • fDate
    9-12 May 1995
  • Firstpage
    848
  • Abstract
    With speech recognition systems steadily improving in performance, freedom from headsets and push-buttons to activate the recognizer is one of the most important issues for achieving user acceptance. Microphone arrays and beamforming can deliver signals in which undesired jamming signals are suppressed, but they rely on knowing where the desired signal source is located in space. This knowledge is usually derived by identifying the loudest signal source. Knowing who is speaking to whom and where should, however, depend not on loudness but on the communication purpose. In this paper, we present acoustic and visual modules that track the face of a speaker of interest for sound source localization and use beamforming for signal extraction. It is shown that in noisy environments a more accurate localization in space can be delivered visually than acoustically. Given a reliable location finder, beamforming substantially improves recognition accuracy.
  • Keywords
    interference suppression; jamming; speech recognition; acoustic modules; location finder; microphone arrays; noisy environments; performance; signal extraction; sound source localization; tracking; undesired jamming signal suppression; user acceptance; visual modules; visually guided beamforming; Acoustic beams; Acoustic noise; Acoustic sensors; Array signal processing; Delay; Jamming; Loudspeakers; Microphone arrays; Sensor arrays; Speech recognition; Working environment noise
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 1995. ICASSP-95., 1995 International Conference on
  • Conference_Location
    Detroit, MI
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-2431-5
  • Type

    conf

  • DOI
    10.1109/ICASSP.1995.479827
  • Filename
    479827