Regional Information Center for Science and Technology - Simultaneous Object Recognition and Localization in Image Collections

DocumentCode :

2501145

Title :

Simultaneous Object Recognition and Localization in Image Collections

Author :

Wang, Shao-Chuan ; Wang, Yu-Chiang Frank

Author_Institution :

Res. Center for Inf. Technol. Innovation, Acad. Sinica, Taipei, Taiwan

fYear :

2010

fDate :

Aug. 29 2010-Sept. 1 2010

Firstpage :

497

Lastpage :

504

Abstract :

This paper presents a weakly supervised method that addresses object localization and recognition simultaneously. Unlike prior work that relies on exhaustive search methods such as sliding windows, we propose to learn category-specific and image-specific visual words in image collections by extracting discriminative feature information with two different types of support vector machines: the standard L2-regularized L1-loss SVM, and the one with L1 regularization and L2 loss. The selected visual words are used to construct visual attention maps, which provide descriptive information for each object category. To preserve local spatial information, we further refine these maps by Gaussian smoothing and cross bilateral filtering, so that both appearance and spatial information can be used for visual categorization. Our method is not limited to any specific type of image descriptor, or to any particular codebook learning or feature encoding technique. In this paper, we conduct preliminary experiments on a subset of the Caltech-256 dataset using bag-of-features (BoF) models with SIFT descriptors. We show that the use of our visual attention maps improves recognition performance, and that the maps built from visual words selected by L1-regularized L2-loss SVMs exhibit the best recognition and localization results.
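The abstract does not spell out how the attention maps are computed, so the following is only a minimal sketch of the general idea under stated assumptions: each pixel is assumed to already carry a visual-word index, each word is assumed to have a weight from some linear classifier (for example a sparse weight vector of the kind an L1-regularized SVM tends to produce), and the map is the positive part of that weight followed by Gaussian smoothing. The function names `attention_map` and `gaussian_smooth`, the clipping to positive weights, and the edge padding are illustrative choices, not the authors' implementation; the cross bilateral filtering step is omitted.

```python
import numpy as np


def attention_map(word_ids, weights):
    """Give each pixel the positive part of its visual word's weight.

    word_ids: 2-D integer array, one visual-word index per pixel (assumed input).
    weights:  1-D array, one classifier weight per visual word (assumed input).
    """
    positive = np.maximum(np.asarray(weights, dtype=float), 0.0)
    return positive[np.asarray(word_ids)]


def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian kernel truncated at about three sigma."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1, dtype=float)
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    return kernel / kernel.sum()


def gaussian_smooth(image, sigma):
    """Separable Gaussian smoothing with edge padding; output keeps the input shape."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    padded = np.pad(np.asarray(image, dtype=float), radius, mode="edge")
    # Filter along rows, then along columns.
    rows = np.apply_along_axis(lambda v: np.convolve(v, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda v: np.convolve(v, kernel, mode="valid"), 0, rows)


if __name__ == "__main__":
    # Toy 4x4 image with three visual words; word 2 has zero weight (sparse selection).
    word_ids = np.array([[0, 0, 1, 1],
                         [0, 0, 1, 1],
                         [2, 2, 1, 1],
                         [2, 2, 2, 2]])
    weights = np.array([0.8, -0.3, 0.0])
    smoothed = gaussian_smooth(attention_map(word_ids, weights), sigma=1.0)
    print(smoothed.round(3))
```

In this sketch, only pixels assigned to positively weighted words contribute to the map, and the smoothing spreads that evidence to neighboring pixels so that a localized region, rather than isolated points, is highlighted.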

Keywords :

filtering theory; image coding; image recognition; support vector machines; Caltech-256 dataset; Gaussian smoothing; L1-regularized L2-loss SVM; SIFT descriptors; bag-of-feature models; category visual words; codebook learning; cross bilateral filtering; exhaustive search methods; feature encoding techniques; feature information extraction; image collections; image descriptors; image-specific visual words; local spatial information; object category; object localization; object recognition; support vector machines; visual attention maps; visual categorization applications; Computational modeling; Histograms; Object recognition; Pixel; Support vector machines; Training; Visualization;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Advanced Video and Signal Based Surveillance (AVSS), 2010 Seventh IEEE International Conference on

Conference_Location :

Boston, MA

Print_ISBN :

978-1-4244-8310-5

Type :

conf

DOI :

10.1109/AVSS.2010.47

Filename :

5597094

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2501145