DocumentCode
2608489
Title
Combined Category Visual Vocabulary: A new approach to visual vocabulary construction
Author
Zhang Jianjia ; Luo Limin
Author_Institution
Lab. of Image Sci. & Technol., Southeast Univ., Nanjing, China
Volume
3
fYear
2011
fDate
15-17 Oct. 2011
Firstpage
1409
Lastpage
1415
Abstract
The bag-of-words (BoW) model, inspired by the problem of text representation and classification, has attracted intensive attention in object and scene categorization for its flexibility and good performance. In the BoW model, a visual vocabulary is obtained by clustering local patches detected in a training image set, and an image is then represented by a histogram of visual words. However, constructing an effective visual vocabulary remains a crucial and challenging step in the BoW model. Conventional methods for constructing a visual vocabulary are very time-consuming, and the resulting vocabulary is not discriminative enough. We propose a novel approach to visual vocabulary construction: the Combined Category Visual Vocabulary (CCVV). First, a category visual vocabulary is obtained for each image category. Then all of these category visual vocabularies are combined to form a general visual vocabulary that can be used to represent images. Each visual word in the CCVV is related to one image category, so it has a higher discriminative ability to separate that image category from the others. The proposed approach also decreases the computational complexity by clustering local patches from only one category at a time instead of from all categories at once. Local interest patches are obtained by means of the Harris-Affine detector and described by the scale-invariant feature transform (SIFT) descriptor. A support vector machine (SVM) is used to train the classifier in our experiments. The proposed approach is evaluated on the VOC 2006 database, and the experimental results demonstrate that it is more computationally efficient and achieves better performance than conventional approaches.
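The two-stage construction described in the abstract (one vocabulary per category, then concatenation into a general vocabulary) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the clustering algorithm, so plain k-means is assumed here, and the function names (`kmeans`, `build_ccvv`, `bow_histogram`), the `words_per_category` parameter, and the 2-D toy descriptors are all hypothetical. Harris-Affine detection, SIFT description, and SVM training are omitted.

```python
import random
from typing import Dict, List, Sequence

Vector = Sequence[float]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: Sequence[Vector], k: int, iters: int = 20, seed: int = 0) -> List[List[float]]:
    """Plain k-means (an assumed choice of clustering algorithm)."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(list(points), k)]
    for _ in range(iters):
        groups: List[List[Vector]] = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centers[j]))
            groups[nearest].append(p)
        for j, group in enumerate(groups):
            if group:  # an empty cluster keeps its previous center
                centers[j] = [sum(col) / len(group) for col in zip(*group)]
    return centers


def build_ccvv(descriptors_by_category: Dict[str, Sequence[Vector]],
               words_per_category: int) -> List[List[float]]:
    """Cluster each category separately, then concatenate the category vocabularies."""
    vocabulary: List[List[float]] = []
    for category in sorted(descriptors_by_category):
        vocabulary.extend(kmeans(descriptors_by_category[category], words_per_category))
    return vocabulary


def bow_histogram(descriptors: Sequence[Vector], vocabulary: Sequence[Vector]) -> List[float]:
    """Normalized histogram of nearest visual words for one image."""
    hist = [0.0] * len(vocabulary)
    for d in descriptors:
        nearest = min(range(len(vocabulary)), key=lambda j: sq_dist(d, vocabulary[j]))
        hist[nearest] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


# Toy 2-D "descriptors" standing in for SIFT vectors (hypothetical data).
toy = {
    "car": [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.1, 1.0)],
    "cat": [(5.0, 5.0), (5.1, 5.0), (6.0, 6.0), (6.1, 6.0)],
}
vocab = build_ccvv(toy, words_per_category=2)
hist = bow_histogram([(0.0, 0.1), (5.0, 5.1)], vocab)
```

In this sketch each k-means run sees only one category's descriptors, which is where the reduction in clustering cost claimed in the abstract would come from. The resulting histograms could then be passed to any SVM implementation as feature vectors.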
Keywords
image representation; pattern clustering; support vector machines; Harris-Affine detector; SIFT descriptor; bag-of-words model; category visual vocabulary; computational complexity; image category; image representation; local patch clustering; scale invariant feature transform; support vector machine; Feature extraction; Histograms; Principal component analysis; Support vector machines; Training; Visualization; Vocabulary; BOW; CCVV; combined category visual vocabulary; scene classification; visual categorization;
fLanguage
English
Publisher
IEEE
Conference_Titel
Image and Signal Processing (CISP), 2011 4th International Congress on
Conference_Location
Shanghai
Print_ISBN
978-1-4244-9304-3
Type
conf
DOI
10.1109/CISP.2011.6100500
Filename
6100500