Regional Information Center for Science and Technology - Exploring the optimal visual vocabulary sizes for semantic concept detection

DocumentCode :

629073

Title :

Exploring the optimal visual vocabulary sizes for semantic concept detection

Author :

Jinlin Guo ; Zhengwei Qiu ; Gurrin, C.

Author_Institution :

CLARITY & Sch. of Comput., Dublin City Univ., Dublin, Ireland

fYear :

2013

fDate :

17-19 June 2013

Firstpage :

109

Lastpage :

114

Abstract :

The framework based on the Bag-of-Visual-Words (BoVW) feature representation and SVM classification is widely used for generic content-based concept detection or visual categorization. However, the visual vocabulary (VV) size, an important factor in this framework, has been chosen inconsistently and arbitrarily in previous work. In this paper, we investigate how the optimal VV size depends on the other components of this framework that also govern performance. The result is useful as a default VV size that reduces computation cost. A series of VVs covering a wide range of sizes is built by unsupervised clustering and evaluated under two popular local features, three assignment modes, and four kernels on two benchmark datasets of different scales. These factors are themselves also evaluated. Experimental results show that the best VV size varies as these factors change. However, concept detection performance usually improves as the VV size initially increases, and then gains less, or even deteriorates because of overfitting, when larger VVs are used. Overall, VVs with sizes from 1024 to 4096 are more likely to achieve the best performance than VVs of other sizes. With regard to the other factors, experimental results show that the OpponentSIFT descriptor outperforms the SURF feature, and that soft assignment yields better performance than binary and hard assignment. In addition, generalized RBF kernels such as the χ² and Laplace RBF kernels are more appropriate for semantic concept detection with SVM classification.
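
The three assignment modes and the two generalized RBF kernels named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the Gaussian weighting used for soft assignment, the L1 normalisation, the `sigma` and `gamma` parameters, and the toy data are all assumptions made for illustration, and the record does not state which exact kernel formulas the paper uses.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def bovw_histogram(descriptors, vocabulary, mode="hard", sigma=1.0):
    """Build an L1-normalised BoVW histogram for one image (illustrative).

    mode="binary": a visual word is marked 1 if any descriptor maps to it.
    mode="hard":   each descriptor adds one count to its nearest word.
    mode="soft":   each descriptor spreads one unit of weight over all
                   words with Gaussian weights (an assumed scheme).
    """
    hist = [0.0] * len(vocabulary)
    for d in descriptors:
        dists = [sq_dist(d, w) for w in vocabulary]
        if mode == "soft":
            dmin = min(dists)  # subtracted for numerical stability only
            weights = [math.exp(-(x - dmin) / (2 * sigma ** 2)) for x in dists]
            total = sum(weights)
            for i, w in enumerate(weights):
                hist[i] += w / total
        else:
            nearest = dists.index(min(dists))
            if mode == "binary":
                hist[nearest] = 1.0
            else:
                hist[nearest] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total > 0 else hist

def chi2_rbf(x, y, gamma=1.0):
    """One common form of the chi-square RBF kernel on histograms."""
    d = sum((a - b) ** 2 / (a + b) for a, b in zip(x, y) if a + b > 0)
    return math.exp(-gamma * d)

def laplace_rbf(x, y, gamma=1.0):
    """Laplace (L1-distance) RBF kernel on histograms."""
    d = sum(abs(a - b) for a, b in zip(x, y))
    return math.exp(-gamma * d)

# Toy data (hypothetical): a 3-word vocabulary and 3 local descriptors.
vocabulary = [[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]]
descriptors = [[0.1, 0.0], [0.9, 1.1], [1.0, 0.9]]

hard = bovw_histogram(descriptors, vocabulary, mode="hard")
binary = bovw_histogram(descriptors, vocabulary, mode="binary")
soft = bovw_histogram(descriptors, vocabulary, mode="soft")
```

In practice the vocabulary would come from clustering local descriptors (for example with k-means), and the kernel values would be passed to an SVM as a precomputed kernel matrix. The vocabulary size studied in the paper corresponds to `len(vocabulary)` here.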

Keywords :

feature extraction; image classification; image representation; object detection; pattern clustering; radial basis function networks; support vector machines; transforms; χ2; BoVW; Laplace RBF kernels; OpponentSIFT descriptor; SURF feature; SVM classification; VV; assignment modes; bag-of-visual-words feature representation; content-based concept detection; local features; optimal visual vocabulary sizes; semantic concept detection; soft assignment mode; unsupervised clustering; visual categorization; Feature extraction; Kernel; Semantics; Support vector machines; TV; Visualization; Vocabulary;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Content-Based Multimedia Indexing (CBMI), 2013 11th International Workshop on

Conference_Location :

Veszprem

ISSN :

1949-3983

Print_ISBN :

978-1-4799-0955-1

Type :

conf

DOI :

10.1109/CBMI.2013.6576565

Filename :

6576565

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=629073