DocumentCode :
3494802
Title :
An efficient framework on large-scale video genre classification
Author :
Zhang, Ning ; Guan, Ling
Author_Institution :
Ryerson Multimedia Res. Lab., Ryerson Univ., Toronto, ON, Canada
fYear :
2010
fDate :
4-6 Oct. 2010
Firstpage :
481
Lastpage :
486
Abstract :
Efficient data mining and indexing are important for multimedia analysis and retrieval. In large-scale video analysis, effective genre categorization plays an important role and serves as one of the fundamental steps of data mining. Existing works rely on domain-knowledge-dependent feature extraction, which limits both genre diversification and data-volume scalability. In this paper, we propose a systematic framework for automatically classifying video genres using domain-knowledge-independent descriptors for feature extraction and a bag-of-visual-words (BoW) model for compact video representation. The scale-invariant feature transform (SIFT) local descriptor, accelerated by GPU hardware, is adopted for feature extraction. A BoW model with an innovative codebook generation scheme based on bottom-up two-layer K-means clustering is proposed to abstract the video characteristics. Besides the histogram-based distribution for summarizing video data, a modified latent Dirichlet allocation (mLDA) based distribution is also introduced. At the classification stage, a k-nearest neighbor (k-NN) classifier is employed. Compared with state-of-the-art large-scale genre categorization, experimental results on a 23-sports dataset demonstrate that the proposed framework achieves comparable classification accuracy with 27% and 64% expansion in data volume and diversity, respectively.
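The histogram-based branch of the pipeline described in the abstract (local descriptors, two-layer K-means codebook, BoW histogram, k-NN) can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not specify how the two layers are arranged, so the sketch assumes that the first layer clusters each video's descriptors separately and the second layer clusters the pooled first-layer centroids. Random arrays stand in for GPU-extracted SIFT descriptors, the mLDA variant is omitted, and all function names and parameters (`kmeans`, `two_layer_codebook`, `bow_histogram`, `knn_predict`, `k_local`, `k_global`) are hypothetical.

```python
import numpy as np


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's K-means; returns a (k, d) array of centroids."""
    rng = np.random.default_rng(seed)
    start = rng.choice(len(points), size=k, replace=False)
    centroids = points[start].astype(float)
    for _ in range(iters):
        # Squared Euclidean distance from every point to every centroid.
        d2 = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members):  # an empty cluster keeps its old centroid
                centroids[j] = members.mean(axis=0)
    return centroids


def two_layer_codebook(per_video_descriptors, k_local, k_global):
    """Assumed bottom-up scheme: per-video clustering, then pooled clustering."""
    local = [kmeans(d, k_local) for d in per_video_descriptors]
    return kmeans(np.vstack(local), k_global)


def bow_histogram(descriptors, codebook):
    """Normalized histogram of nearest-codeword assignments for one video."""
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    words = d2.argmin(axis=1)
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / hist.sum()


def knn_predict(query_hist, train_hists, train_labels, k=3):
    """Majority vote among the k training histograms closest to the query."""
    dists = np.linalg.norm(train_hists - query_hist, axis=1)
    nearest = np.argsort(dists)[:k]
    values, counts = np.unique(train_labels[nearest], return_counts=True)
    return values[counts.argmax()]
```

In use, one would build the codebook from the training videos' descriptor arrays, convert every video to a histogram with `bow_histogram`, and pass the stacked training histograms and a NumPy label array to `knn_predict`. Clustering per video first keeps each K-means run small, which is one plausible motivation for a bottom-up design when the descriptor count is large.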
Keywords :
data mining; feature extraction; image classification; indexing; pattern clustering; video retrieval; video signal processing; BoW model; GPU hardware; bottom-up two-layer K-means clustering; data diversity; data mining; data volume scalability; domain-knowledge independent descriptors; feature extraction; genre diversification; histogram-based distribution; indexing; innovative codebook generation; k-nearest neighbor classifier; large-scale video genre classification; modified latent Dirichlet allocation based distribution; multimedia analysis; multimedia retrieval; scale invariant feature transform local descriptor; Accuracy; Computational modeling; Feature extraction; Graphics processing unit; Histograms; Mathematical model; Periodic structures;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Multimedia Signal Processing (MMSP), 2010 IEEE International Workshop on
Conference_Location :
Saint Malo
Print_ISBN :
978-1-4244-8110-1
Electronic_ISBN :
978-1-4244-8111-8
Type :
conf
DOI :
10.1109/MMSP.2010.5662069
Filename :
5662069