• DocumentCode
    3102268
  • Title
    Independent information from visual features for multimodal speech recognition
  • Author
    Gurbuz, Sabri ; Tufekci, Zekeriya ; Patterson, Eric ; Gowdy, John N.
  • Author_Institution
    Dept. of Electr. & Comput. Eng., Clemson Univ., SC, USA
  • fYear
    2001
  • fDate
    2001
  • Firstpage
    221
  • Lastpage
    228
  • Abstract
    The performance of audio-based speech recognition systems degrades severely when there is a mismatch between training and usage environments due to background noise. This degradation is due to a loss of ability to extract and distinguish important information from audio features. One of the emerging techniques for dealing with this problem is the addition of visual features in a multimodal recognition system. This paper presents an affine-invariant, multimodal speech recognition system and focuses on the additional information that is available from video features. Results are presented that demonstrate the distinct information available from a visual subsystem that will allow optimal joint-decisions based on the SNR-ratio and type of noise to exceed either audio or video subsystem in nearly all noisy environments.
  • Keywords
    acoustic noise; feature extraction; image recognition; speech recognition; video signal processing; SNR-ratio; affine-invariant multimodal speech recognition system; audio features; audio subsystem; audio-based speech recognition systems; background noise; multimodal recognition system; multimodal speech recognition; optimal joint-decisions; video features; visual features; visual subsystem; Acoustic noise; Automatic speech recognition; Background noise; Degradation; Feature extraction; Humans; Speech enhancement; Speech recognition; System performance; Working environment noise;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    SoutheastCon 2001. Proceedings. IEEE
  • Conference_Location
    Clemson, SC
  • Print_ISBN
    0-7803-6748-0
  • Type
    conf
  • DOI
    10.1109/SECON.2001.923119
  • Filename
    923119