DocumentCode :
177892
Title :
Multiple Instance Dictionary Learning for Activity Representation
Author :
Umakanthan, S. ; Denman, S. ; Fookes, C. ; Sridharan, S.
Author_Institution :
Image & Video Res. Lab., Queensland Univ. of Technol., Brisbane, QLD, Australia
fYear :
2014
fDate :
24-28 Aug. 2014
Firstpage :
1377
Lastpage :
1382
Abstract :
This paper presents an effective feature representation method in the context of activity recognition. Efficient and effective feature representation plays a crucial role not only in activity recognition, but also in a wide range of applications such as motion analysis, tracking, and 3D scene understanding. In the context of activity recognition, local features are increasingly popular for representing videos because of their simplicity and efficiency. While they achieve state-of-the-art performance with low computational requirements, their performance is still limited for real-world applications due to a lack of contextual information and to models not being tailored to specific activities. We propose a new activity representation framework to address the shortcomings of the popular but simple bag-of-words approach. In our framework, multiple instance SVM (mi-SVM) is first used to identify positive features for each action category, and the k-means algorithm is used to generate a codebook. Locality-constrained linear coding is then used to encode the features with the generated codebook, followed by spatio-temporal pyramid pooling to convey the spatio-temporal statistics. Finally, an SVM is used to classify the videos. Experiments carried out on two popular datasets of varying complexity demonstrate significant performance improvement over the base-line bag-of-feature method.
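The middle stages named in the abstract (k-means codebook, locality-constrained linear coding, spatio-temporal pyramid pooling) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, the parameter values (knn, reg, levels), the use of max pooling, and the assumption that feature positions are (x, y, t) coordinates normalized to [0, 1) are all hypothetical choices made for illustration. The coding step uses the common k-nearest-neighbour approximation of locality-constrained linear coding, and the mi-SVM feature selection and final SVM classification stages are omitted.

```python
import numpy as np


def kmeans_codebook(features, n_atoms, n_iter=20, seed=0):
    """Plain Lloyd's k-means; returns an (n_atoms, dim) codebook."""
    rng = np.random.default_rng(seed)
    # Initialise the centres from randomly chosen input features.
    centers = features[rng.choice(len(features), n_atoms, replace=False)].copy()
    for _ in range(n_iter):
        # Squared Euclidean distance from every feature to every centre.
        d2 = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(n_atoms):
            members = features[labels == j]
            if len(members):  # keep the old centre if a cluster is empty
                centers[j] = members.mean(axis=0)
    return centers


def llc_encode(features, codebook, knn=5, reg=1e-4):
    """Approximate LLC: code each feature over its knn nearest atoms.

    Each row of the result sums to 1 and has at most knn non-zero entries.
    """
    n, m = len(features), len(codebook)
    codes = np.zeros((n, m))
    d2 = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :knn]
    ones = np.ones(knn)
    for i in range(n):
        idx = nearest[i]
        z = codebook[idx] - features[i]  # atoms shifted to the feature
        cov = z @ z.T                    # local covariance, (knn, knn)
        # Regularise so the linear system is well conditioned.
        cov += reg * (np.trace(cov) + 1e-12) * np.eye(knn)
        w = np.linalg.solve(cov, ones)
        codes[i, idx] = w / w.sum()      # enforce the sum-to-one constraint
    return codes


def pyramid_max_pool(codes, positions, levels=(1, 2)):
    """Max-pool codes inside each cell of a g x g x g grid, per level.

    positions is an (n, 3) array of (x, y, t) values in [0, 1).
    Empty cells contribute a zero vector.
    """
    pooled = []
    for g in levels:
        cell = np.minimum((positions * g).astype(int), g - 1)
        cell_id = cell[:, 0] * g * g + cell[:, 1] * g + cell[:, 2]
        for c in range(g ** 3):
            mask = cell_id == c
            if mask.any():
                pooled.append(codes[mask].max(axis=0))
            else:
                pooled.append(np.zeros(codes.shape[1]))
    return np.concatenate(pooled)
```

With levels=(1, 2) the pooled video descriptor has (1 + 8) times as many entries as the codebook has atoms; in a full pipeline this vector would be the input to the final classifier.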
Keywords :
feature extraction; image classification; image motion analysis; spatiotemporal phenomena; support vector machines; video signal processing; 3D scene understanding; activity recognition; activity representation; base-line bag-of-feature method; feature representation method; k-means algorithm; local features; locality-constrained linear coding; mi-SVM; motion analysis; motion tracking; multiple instance SVM; multiple instance dictionary learning; spatio-temporal pyramid pooling; spatio-temporal statistics; video classification; Context; Dictionaries; Encoding; Feature extraction; Histograms; Support vector machines; Training;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Pattern Recognition (ICPR), 2014 22nd International Conference on
Conference_Location :
Stockholm
ISSN :
1051-4651
Type :
conf
DOI :
10.1109/ICPR.2014.246
Filename :
6976956