• DocumentCode
    384356
  • Title
    Integrated analysis of speech and images as a probabilistic decoding process
  • Author
    Wachsmuth, Sven ; Sagerer, Gerhard
  • Author_Institution
    Fac. of Technol., Bielefeld Univ., Germany
  • Volume
    2
  • fYear
    2002
  • fDate
    2002
  • Firstpage
    588
  • Abstract
    Speech understanding and vision are the two most important modalities in human-human communication. However, emulating them on a computer faces fundamental difficulties due to noisy data, vague meanings, previously unseen objects or unheard words, occlusions, spontaneous speech effects, and context dependence. As a result, the interpretation processes on both channels are highly error-prone. This paper presents a new perspective on the problem of relating speech and image interpretations, treating it as a probabilistic decoding process. It is shown that such an integration scheme is robust to partial or erroneous interpretations. Furthermore, it is shown that implicit error-correction strategies leading to improved scene interpretation can be formulated within this probabilistic framework.
  • Keywords
    belief networks; image coding; image recognition; probability; speech coding; speech recognition; user interfaces; Bayesian networks; error correction; human-computer interface; natural-language; probabilistic decoding; probability distribution; scene interpretation; speech interpretations; Computer errors; Context; Decoding; Emulation; Error correction; Face detection; Image analysis; Robustness; Speech analysis; Speech processing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Pattern Recognition, 2002. Proceedings. 16th International Conference on
  • ISSN
    1051-4651
  • Print_ISBN
    0-7695-1695-X
  • Type
    conf
  • DOI
    10.1109/ICPR.2002.1048371
  • Filename
    1048371