• DocumentCode
    1253840
  • Title
    A review of speech-based bimodal recognition
  • Author
    Chibelushi, Claude C.; Deravi, Farzin; Mason, John S. D.
  • Author_Institution
    Sch. of Comput., Staffordshire Univ., Stafford, UK
  • Volume
    4
  • Issue
    1
  • fYear
    2002
  • fDate
    3/1/2002 12:00:00 AM
  • Firstpage
    23
  • Lastpage
    37
  • Abstract
    Speech recognition and speaker recognition by machine are crucial ingredients for many important applications such as natural and flexible human-machine interfaces. Most developments in speech-based automatic recognition have relied on acoustic speech as the sole input signal, disregarding its visual counterpart. However, recognition based on acoustic speech alone can be afflicted with deficiencies that preclude its use in many real-world applications, particularly under adverse conditions. The combination of auditory and visual modalities promises higher recognition accuracy and robustness than can be obtained with a single modality. Multimodal recognition is therefore acknowledged as a vital component of the next generation of spoken language systems. The paper reviews the components of bimodal recognizers, discusses the accuracy of bimodal recognition, and highlights some outstanding research issues as well as possible application domains.
  • Keywords
    audio-visual systems; bibliographies; image processing; multimedia computing; speech recognition; acoustic speech; adverse conditions; audio-visual fusion; auditory modalities; bimodal recognizers; joint media processing; multimodal recognition; natural flexible human-machine interfaces; real-world applications; recognition accuracy; speaker recognition; speech recognition; speech-based automatic recognition; speech-based bimodal recognition; spoken language systems; visual modalities; Acoustic applications; Application software; Auditory system; Automatic speech recognition; Computer interfaces; Data mining; Man machine systems; Robustness; Speaker recognition; Speech recognition
  • fLanguage
    English
  • Journal_Title
    Multimedia, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1520-9210
  • Type
    jour
  • DOI
    10.1109/6046.985551
  • Filename
    985551