• DocumentCode
    178814
  • Title
    Automatic Segmentation and Recognition of Human Actions in Monocular Sequences
  • Author
    Orrite, C. ; Rodriguez, M. ; Herrero, E. ; Rogez, G. ; Velastin, S.A.
  • Author_Institution
    Aragon Inst. of Eng. Res., Univ. of Zaragoza, Zaragoza, Spain
  • fYear
    2014
  • fDate
    24-28 Aug. 2014
  • Firstpage
    4218
  • Lastpage
    4223
  • Abstract
    This paper addresses the problem of silhouette-based human action segmentation and recognition in monocular sequences. Motion History Images (MHIs), used as 2D templates, capture motion information by encoding where and when motion occurred in the images. Inspired by codebook approaches for object and scene categorization, we first construct a codebook of temporal motion templates by clustering all the MHIs of each particular action. These MHIs capture different actors, speeds and a wide range of camera viewpoints. In this paper, we use Kohonen's Self-Organizing Map (SOM) to simultaneously cluster the MHI templates and represent them in lower-dimensional subspaces. To cope with temporal segmentation, and concurrently carry out action recognition, a new architecture is proposed in which the observation MHIs are projected onto all these action-specific manifolds, and the Euclidean distance between each MHI and the nearest cluster within each action manifold constitutes the observation vector of a Markov model. To estimate the state/action at each time step, we introduce a new method based on Observable Markov Models (OMMs) in which the Markov model is augmented with a neutral state. The combination of our action-specific manifolds with the augmented OMM allows long sequences of consecutive actions to be segmented and recognized automatically, without any prior knowledge of the initial and ending frames of each action. Importantly, our method can interpolate between training viewpoints and recognize actions independently of the camera viewpoint, even from unseen viewpoints.
  • Keywords
    Markov processes; cameras; image coding; image recognition; image segmentation; image sequences; interpolation; motion estimation; self-organising feature maps; video signal processing; 2D templates; Euclidean distance; MHI template clustering; OMM; SOM; action estimation; action-specific manifolds; automatic silhouette-based human action recognition; automatic silhouette-based human action segmentation; camera viewpoint; information encoding; interpolation; low-dimensional subspace representation; monocular sequences; motion history images; motion information capture; nearest cluster; neutral state; observable Markov models; observation vector; self-organizing map; state estimation; temporal motion template codebook; temporal segmentation; training viewpoint; unseen viewpoints; Cameras; Hidden Markov models; Kernel; Manifolds; Markov processes; Motion segmentation; Vectors;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Pattern Recognition (ICPR), 2014 22nd International Conference on
  • Conference_Location
    Stockholm
  • ISSN
    1051-4651
  • Type
    conf
  • DOI
    10.1109/ICPR.2014.723
  • Filename
    6977435