• DocumentCode
    2380768
  • Title
    A multi-modal graphical model for robust recognition of group actions in meetings from disturbed videos
  • Author
    Al-Hames, Marc ; Rigoll, Gerhard
  • Author_Institution
    Inst. for Human-Machine Commun., Technische Univ. München, Germany
  • Volume
    3
  • fYear
    2005
  • fDate
    11-14 Sept. 2005
  • Abstract
    In this work we present a novel multi-modal mixed-state dynamic Bayesian network (DBN) for robust meeting event classification from disturbed videos. The model uses information from the audio and visual channels to structure meetings into segments. Within the DBN, a multi-stream hidden Markov model (HMM) is coupled with a linear dynamical system (LDS) to compensate for disturbances in the visual channel. The HMM serves as the driving input for the LDS, which allows the model to handle noise and occlusions in the video. Experimental results on real meeting data show that the new model is highly preferable to all single-stream approaches. Compared to a baseline multi-modal early-fusion HMM, the new DBN is 3.5% better on clear data and up to 6.1% better on visually disturbed data, corresponding to relative error reductions of 23.6% and 29.9%, respectively.
  • Keywords
    belief networks; hidden Markov models; image recognition; video streaming; disturbed videos; group action recognition; linear dynamical system; meeting event classification; multimodal graphical model; multimodal mixed-state dynamic Bayesian network; multistream hidden Markov model; Bayesian methods; Cameras; Graphical models; Hidden Markov models; Intelligent networks; Legged locomotion; Microphone arrays; Robustness; Speech analysis; Videos;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Image Processing, 2005. ICIP 2005. IEEE International Conference on
  • Print_ISBN
    0-7803-9134-9
  • Type
    conf
  • DOI
    10.1109/ICIP.2005.1530418
  • Filename
    1530418
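
Note on the figures in the abstract: the abstract gives absolute gains (3.5% and up to 6.1%) and relative error reductions (23.6% and 29.9%) but does not state the underlying error rates. The sketch below back-computes them, assuming the absolute gains are percentage points of error rate and that relative reduction = absolute gain / baseline error. The function name is hypothetical, and the resulting values are implied estimates, not numbers reported in this record.

```python
def implied_error_rates(abs_gain_pct, rel_reduction_pct):
    """Back out (baseline_error, new_error), both in percent.

    Assumes rel_reduction = abs_gain / baseline_error, with abs_gain
    measured in percentage points. This is an interpretation of the
    abstract, not a formula stated in it.
    """
    baseline = abs_gain_pct / (rel_reduction_pct / 100.0)
    return baseline, baseline - abs_gain_pct


# Figures quoted in the abstract: (condition, absolute gain, relative reduction)
quoted = [("clear", 3.5, 23.6), ("visually disturbed", 6.1, 29.9)]

for label, gain, rel in quoted:
    baseline, new = implied_error_rates(gain, rel)
    print(f"{label}: baseline error ~{baseline:.1f}%, DBN error ~{new:.1f}%")
```

Under these assumptions the baseline early-fusion HMM would have an error rate of roughly 14.8% on clear data and 20.4% on visually disturbed data.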