• DocumentCode
    1997634
  • Title

    Comparing alternatives for capturing dynamic information in Bag-of-Visual-Features approaches applied to human actions recognition

  • Author

    Lopes, Ana Paula B ; Oliveira, Rodrigo Silva ; de Almeida, Jussara M. ; de A Araujo, Arnaldo

  • Author_Institution
    Exact & Technol. Sci. Dept., State Univ. of Santa Cruz, Santa Cruz, Brazil
  • fYear
    2009
  • fDate
    5-7 Oct. 2009
  • Firstpage
    1
  • Lastpage
    6
  • Abstract
    Bag-of-Visual-Features (BoVF) representations have achieved great success in object recognition, mainly because of their robustness to several kinds of variation and to occlusion. Recently, a number of BoVF approaches have also been proposed for recognizing human actions in videos. One important issue that arises when using BoVF for videos is how to take dynamic information into account, and most proposals rely on 3D extensions of 2D visual descriptors for this. However, we envision alternative approaches based on 2D descriptors applied to the spatio-temporal video planes, instead of to the purely spatial planes traditionally explored by previous work. Thus, in this paper, we address the following question: what is the cost-effectiveness of a BoVF approach built from such 2D descriptors when compared to one based on the state-of-the-art 3D Spatio-Temporal Interest Points (STIPs) descriptor? We evaluate the recognition rate and time complexity of alternative 2D descriptors applied to different sets of spatio-temporal planes, and of the state-of-the-art STIPs. Experimental results show that, with proper settings, 2D descriptors can yield the same recognition results as those provided by STIPs, but at a significantly higher time complexity.
  • Keywords
    feature extraction; video signal processing; 2D visual descriptors; bag-of-visual-features approaches; dynamic information; human actions recognition; object recognition; occlusion; spatio-temporal video planes; Biological system modeling; Computer science; Humans; Layout; Object detection; Object recognition; Proposals; Robustness; Spatiotemporal phenomena; Videos
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Multimedia Signal Processing, 2009. MMSP '09. IEEE International Workshop on
  • Conference_Location
    Rio de Janeiro
  • Print_ISBN
    978-1-4244-4463-2
  • Electronic_ISBN
    978-1-4244-4464-9
  • Type

    conf

  • DOI
    10.1109/MMSP.2009.5293303
  • Filename
    5293303