DocumentCode :
2501145
Title :
Simultaneous Object Recognition and Localization in Image Collections
Author :
Wang, Shao-Chuan ; Wang, Yu-Chiang Frank
Author_Institution :
Res. Center for Inf. Technol. Innovation, Acad. Sinica, Taipei, Taiwan
fYear :
2010
fDate :
Aug. 29 2010-Sept. 1 2010
Firstpage :
497
Lastpage :
504
Abstract :
This paper presents a weakly supervised method that addresses object localization and recognition simultaneously. Unlike prior work that relies on exhaustive search methods such as sliding windows, we propose to learn category-specific and image-specific visual words from image collections by extracting discriminative feature information with two types of support vector machines: the standard L2-regularized L1-loss SVM, and an SVM with L1 regularization and L2 loss. The selected visual words are used to construct visual attention maps, which provide descriptive information for each object category. To preserve local spatial information, we further refine these maps by Gaussian smoothing and cross bilateral filtering, so that both appearance and spatial information can be used in visual categorization applications. Our method is not limited to any specific type of image descriptor, or to any particular codebook learning or feature encoding technique. In this paper, we conduct preliminary experiments on a subset of the Caltech-256 dataset using bag-of-features (BOF) models with SIFT descriptors. We show that our visual attention maps improve recognition performance, and that the maps built from visual words selected by L1-regularized L2-loss SVMs give the best recognition and localization results.
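Illustrative sketch (not from the paper): the abstract gives no implementation details, so the code below only approximates the described pipeline under stated assumptions. It fits an L1-regularized L2-loss (squared hinge) linear SVM on toy bag-of-features histograms with a plain proximal-gradient solver, treats the visual words with positive weight as the selected ones, and builds a Gaussian-smoothed attention map from a grid of visual-word indices. The function names, the toy data, the solver choice, and the absence of a bias term are all assumptions; the cross bilateral filtering step is omitted.

```python
import numpy as np

def fit_l1_reg_l2_loss_svm(X, y, C=1.0, n_iter=2000):
    """Minimise ||w||_1 + C * sum_i max(0, 1 - y_i * w.x_i)^2 (no bias term).

    Assumed solver: proximal gradient descent (ISTA), not the paper's solver.
    """
    w = np.zeros(X.shape[1])
    # Step size 1/L, where L = 2*C*sigma_max(X)^2 bounds the gradient's Lipschitz constant.
    step = 1.0 / (2.0 * C * np.linalg.norm(X, 2) ** 2)
    for _ in range(n_iter):
        slack = np.maximum(0.0, 1.0 - y * (X @ w))
        grad = -2.0 * C * (X.T @ (slack * y))
        w = w - step * grad
        # Soft-thresholding is the proximal step for the L1 penalty.
        w = np.sign(w) * np.maximum(np.abs(w) - step, 0.0)
    return w

def attention_map(word_ids, weights):
    """Score each grid location by the positive SVM weight of its visual word."""
    return np.maximum(weights, 0.0)[word_ids]

def gaussian_smooth(img, sigma=1.0):
    """Separable Gaussian filter with zero padding (image must be larger than the kernel)."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    k /= k.sum()
    rows = np.apply_along_axis(np.convolve, 1, img, k, mode="same")
    return np.apply_along_axis(np.convolve, 0, rows, k, mode="same")

# Toy data (hypothetical): 8 visual words, words 2 and 5 occur mainly on the object.
rng = np.random.default_rng(0)
K = 8
object_words = [2, 5]
background_words = [0, 1, 3, 4, 6, 7]
pos = np.full(K, 0.4 / 6)
pos[object_words] = 0.3
neg = np.full(K, 1.0 / 6)
neg[object_words] = 0.0
X = np.vstack([np.tile(pos, (20, 1)), np.tile(neg, (20, 1))])
X = X + 0.01 * rng.random(X.shape)      # small noise on the histograms
X /= X.sum(axis=1, keepdims=True)       # re-normalise each histogram to sum to 1
y = np.array([1.0] * 20 + [-1.0] * 20)

w = fit_l1_reg_l2_loss_svm(X, y)
selected = np.flatnonzero(w > 0)        # visual words kept for this category

# Hypothetical 12x12 grid of visual-word indices with an object patch of word 2.
word_ids = rng.choice(background_words, size=(12, 12))
word_ids[4:8, 4:8] = 2
amap = gaussian_smooth(attention_map(word_ids, w))
```

In this sketch the peak of `amap` marks the region covered by the selected words, which is how such a map could serve localization, while the selected words themselves could reweight the histogram used for recognition.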
Keywords :
filtering theory; image coding; image recognition; support vector machines; Caltech-256 dataset; Gaussian smoothing; L1-regularized L2-loss SVM; SIFT descriptors; bag-of-feature models; category visual words; codebook learning; cross bilateral filtering; exhaustive search methods; feature encoding techniques; feature information extraction; image collections; image descriptors; image-specific visual words; local spatial information; object category; object localization; object recognition; support vector machines; visual attention maps; visual categorization applications; Computational modeling; Histograms; Object recognition; Pixel; Support vector machines; Training; Visualization;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Advanced Video and Signal Based Surveillance (AVSS), 2010 Seventh IEEE International Conference on
Conference_Location :
Boston, MA
Print_ISBN :
978-1-4244-8310-5
Type :
conf
DOI :
10.1109/AVSS.2010.47
Filename :
5597094