DocumentCode :
1759982
Title :
Modeling Geometric-Temporal Context With Directional Pyramid Co-Occurrence for Action Recognition
Author :
Chunfeng Yuan ; Xi Li ; Weiming Hu ; Haibin Ling ; Stephen J. Maybank
Author_Institution :
Nat. Lab. of Pattern Recognition, Inst. of Autom., Beijing, China
Volume :
23
Issue :
2
fYear :
2014
fDate :
Feb. 2014
Firstpage :
658
Lastpage :
672
Abstract :
In this paper, we present a new geometric-temporal representation for visual action recognition based on local spatio-temporal features. First, we propose a modified covariance descriptor under the log-Euclidean Riemannian metric to represent the spatio-temporal cuboids detected in the video sequences. Unlike previously proposed covariance descriptors, our descriptor can be measured and clustered in Euclidean space. Second, to capture the geometric-temporal contextual information, we construct a directional pyramid co-occurrence matrix (DPCM) to describe the spatio-temporal distribution of the vector-quantized local feature descriptors extracted from a video. DPCM characterizes the co-occurrence statistics of local features as well as the spatio-temporal positional relationships among the concurrent features. These statistics provide strong descriptive power for action recognition. To use DPCM for action recognition, we propose a directional pyramid co-occurrence matching kernel to measure the similarity of videos. The proposed method achieves state-of-the-art performance and improves on the recognition performance of bag-of-visual-words (BOVW) models by a large margin on six public data sets. For example, on the KTH data set, it achieves 98.78% accuracy, while the BOVW approach achieves only 88.06%. On both the Weizmann and UCF CIL data sets, it achieves 100% accuracy.
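The co-occurrence idea in the abstract can be illustrated with a rough sketch. This is not the paper's DPCM: it counts only one direction (a feature occurring later in time than another), uses a single scale rather than a pyramid, and the names `Feature`, `directional_cooccurrence`, and `radius` are hypothetical choices made for illustration.

```python
from typing import List, NamedTuple, Sequence


class Feature(NamedTuple):
    """A quantized local feature (hypothetical structure, not from the paper)."""
    x: float
    y: float
    t: float
    word: int  # index of the visual word assigned by vector quantization


def directional_cooccurrence(
    features: Sequence[Feature], num_words: int, radius: float
) -> List[List[int]]:
    """Count ordered word pairs (a, b) where a feature with word b occurs
    strictly later in time than a feature with word a, within `radius`
    of it in (x, y, t) space.

    Assumption: a single "later in time" direction and a single scale;
    the paper's matrix covers more directions and pyramid levels.
    """
    counts = [[0] * num_words for _ in range(num_words)]
    r2 = radius * radius
    for f in features:
        for g in features:
            if g.t <= f.t:
                continue  # keep only pairs where g follows f in time
            d2 = (g.x - f.x) ** 2 + (g.y - f.y) ** 2 + (g.t - f.t) ** 2
            if d2 <= r2:
                counts[f.word][g.word] += 1
    return counts
```

Because the pairs are ordered, the resulting matrix is generally asymmetric, which is what lets it encode positional relationships that a plain bag-of-visual-words histogram discards.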
Keywords :
geometry; gesture recognition; image retrieval; image sequences; matrix algebra; video signal processing; BOVW models; DPCM; KTH data set; bag-of-visual-words models; co-occurrence statistics; directional pyramid co-occurrence matrix; geometric-temporal contextual information; geometric-temporal representation; local spatio-temporal features; log-Euclidean Riemannian metric; modified covariance descriptor; spatio-temporal distribution; spatio-temporal positional relationships; vector-quantized local feature descriptors; video sequences; visual action recognition; Covariance matrices; Feature extraction; Kernel; Measurement; Three-dimensional displays; Vectors; Video sequences; Covariance cuboid descriptor; action recognition; kernel machine; log-Euclidean Riemannian metric; spatio-temporal directional pyramid co-occurrence matrix
fLanguage :
English
Journal_Title :
Image Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1057-7149
Type :
jour
DOI :
10.1109/TIP.2013.2291319
Filename :
6665089