DocumentCode
2867941
Title
Constructing Visual Vocabularies Using Sparse Coding for Action Recognition
Author
Liu, Changhong ; Yang, Yang ; Chen, Yong
Author_Institution
Sch. of Inf. Eng., Univ. of Sci. & Technol. Beijing, Beijing, China
fYear
2009
fDate
19-20 Dec. 2009
Firstpage
1
Lastpage
4
Abstract
Much recent action recognition research uses a bag-of-words (BOW) representation obtained by quantizing 3D interest points extracted from videos. The k-means algorithm is commonly used to construct the visual vocabulary, but it has two major drawbacks. First, the resulting vocabulary is sensitive to the vocabulary size and to the initialization. Second, k-means cannot capture the salient properties of the videos, and the vocabulary may contain a large amount of redundant information. In this paper, we propose a novel action recognition approach that constructs a visual vocabulary and represents a video by sparse coding followed by max pooling. Unlike k-means, sparse coding can capture the salient properties of videos owing to its powerful discriminative ability. Experiments are conducted on the KTH action dataset. The results demonstrate that our approach performs better than k-means and outperforms most recently proposed methods.
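The pipeline named in the abstract (sparse coding of local descriptors, then max pooling into one vector per video) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the objective, the solver, or the pooling details. The sketch assumes a dictionary `D` that has already been learned, solves a standard L1-regularized least-squares problem with ISTA (a solver chosen here for brevity), and pools absolute code values. The names `sparse_codes`, `max_pool`, `represent_video`, `lam`, and `n_iter` are hypothetical.

```python
import numpy as np

def sparse_codes(X, D, lam=0.1, n_iter=200):
    """Approximately solve min_A 0.5*||X - A D||_F^2 + lam*||A||_1 with ISTA.

    X: (n, d) local descriptors of one video (assumed input).
    D: (k, d) dictionary, i.e. the visual vocabulary (assumed already learned).
    Returns A: (n, k) sparse codes, one row per descriptor.
    """
    # Step size from the Lipschitz constant of the smooth term's gradient.
    L = np.linalg.norm(D @ D.T, 2)
    A = np.zeros((X.shape[0], D.shape[0]))
    for _ in range(n_iter):
        grad = (A @ D - X) @ D.T          # gradient of 0.5*||X - A D||_F^2
        Z = A - grad / L                  # gradient step
        A = np.sign(Z) * np.maximum(np.abs(Z) - lam / L, 0.0)  # soft threshold
    return A

def max_pool(A):
    """Pool codes over all descriptors: largest absolute response per atom
    (pooling absolute values is an assumption of this sketch)."""
    return np.abs(A).max(axis=0)

def represent_video(X, D, lam=0.1):
    """Video-level feature vector of length k."""
    return max_pool(sparse_codes(X, D, lam=lam))

# Toy usage: 2 descriptors in 3 dimensions, identity dictionary.
D = np.eye(3)
X = np.array([[1.0, 0.0, 0.0],
              [0.0, 2.0, 0.0]])
video_vec = represent_video(X, D, lam=0.1)
```

With k-means, each descriptor would instead be assigned to a single nearest centroid and the video summarized by a histogram of assignments. Here each descriptor may activate several vocabulary atoms, and pooling keeps the strongest response per atom.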
Keywords
feature extraction; image motion analysis; object recognition; video coding; 3D interest points extraction; KTH action dataset; bag of words representation; human action recognition; k-means algorithm; sparse coding; visual vocabularies construction; Computer vision; Data mining; Detectors; Engineering management; Noise reduction; Prototypes; Signal processing algorithms; Technology management; Videos; Vocabulary
fLanguage
English
Publisher
IEEE
Conference_Titel
Information Engineering and Computer Science, 2009. ICIECS 2009. International Conference on
Conference_Location
Wuhan
Print_ISBN
978-1-4244-4994-1
Type
conf
DOI
10.1109/ICIECS.2009.5366461
Filename
5366461