• DocumentCode
    639377
  • Title
    3D R Transform on Spatio-temporal Interest Points for Action Recognition
  • Author
    Chunfeng Yuan ; Xi Li ; Weiming Hu ; Haibin Ling ; Steve Maybank
  • Author_Institution
    National Laboratory of Pattern Recognition, Institute of Automation, Beijing, China
  • fYear
    2013
  • fDate
    23-28 June 2013
  • Firstpage
    724
  • Lastpage
    730
  • Abstract
    Spatio-temporal interest points serve as an elementary building block in many modern action recognition algorithms, and most of these algorithms exploit the local spatio-temporal volume features using a Bag of Visual Words (BOVW) representation. Such a representation, however, ignores potentially valuable information about the global spatio-temporal distribution of interest points. In this paper, we propose a new global feature to capture the detailed geometrical distribution of interest points. It is computed using the R transform, which is defined as an extended 3D discrete Radon transform, followed by a two-directional two-dimensional principal component analysis. This R feature captures the geometrical information of the interest points, is invariant to geometric transformations, and is robust to noise. In addition, we propose a new fusion strategy that combines the R feature with the BOVW representation to further improve recognition accuracy. We utilize a context-aware fusion method to capture both the pairwise similarities and the higher-order contextual interactions of the videos. Experimental results on several publicly available datasets demonstrate the effectiveness of the proposed approach for action recognition.
  • Keywords
    Radon transforms; computational geometry; feature extraction; image motion analysis; image representation; 3D R transform; BOVW representation; Bag of Visual Words; extended 3D discrete Radon transform; geometrical distribution; geometrical information; local spatio-temporal volume features; modern action recognition algorithms; principal component analysis; spatio-temporal interest points; Accuracy; Context; Feature extraction; Kernel; Three-dimensional displays; Transforms; Videos
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    2013 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
  • Conference_Location
    Portland, OR
  • ISSN
    1063-6919
  • Type
    conf
  • DOI
    10.1109/CVPR.2013.99
  • Filename
    6618943