Object Recognition via Adaptive Multi-level Feature Integration

Author

Wang, Mei ; Wu, Yanling ; Li, Guangda ; Zhou, Xiangdong

Author_Institution

Sch. of Comput. Sci. & Technol., Donghua Univ., Shanghai, China

fYear

2010

fDate

6-8 April 2010

Firstpage

253

Lastpage

259

Abstract

Object category recognition is a challenging task because visual representations are low level and weakly discriminative. Most previous methods concentrate on finding better high-level visual features. Recently, optimally integrating various features to solve the problem has attracted growing interest. In this paper, we provide a novel method for object category recognition that improves the popular bag-of-words (BoW) methods in two respects. First, we propose to extract a series of high-level visual features that exploit both the local spatial co-occurrence between low-level visual words and the global spatial layout of the object parts. To obtain the global spatial features, a fast method is proposed to generate semantically meaningful object parts by exploiting the geometric position distribution of the local salient regions. The image part patches are further quantized into semantically coherent high-level visual words using correlational spectral clustering. Based on these visual words, a simplified 2D string representation is introduced to model the global spatial patterns of the objects. Second, a multi-kernel learning framework is proposed to adaptively integrate the extracted features in an optimal way. For each object class, an optimal feature weight coefficient is learned automatically and separately to combine the low-level and high-level visual features according to their contribution to that class. Tests on the Caltech-101 and Pascal-VOC 06 datasets demonstrate that our method outperforms the baseline BoW method and the state-of-the-art Multi-CM model.
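
The per-class feature integration described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the kernels or the optimization used to learn the weights, so the histogram intersection kernel, the random stand-in histograms, the class names, and the hand-picked weights below are all assumptions made for illustration. The sketch only shows how one kernel per feature type could be merged into a single class-specific kernel by a convex combination.

```python
import numpy as np


def histogram_intersection_kernel(X, Y):
    """Gram matrix with K[i, j] = sum_k min(X[i, k], Y[j, k])."""
    return np.minimum(X[:, None, :], Y[None, :, :]).sum(axis=2)


def combine_kernels(kernels, weights):
    """Convex combination of precomputed Gram matrices."""
    w = np.asarray(weights, dtype=float)
    if np.any(w < 0) or w.sum() == 0:
        raise ValueError("weights must be non-negative and not all zero")
    w = w / w.sum()  # normalise so the weights sum to 1
    return sum(wi * K for wi, K in zip(w, kernels))


# Stand-in data: 6 images, each described by two L1-normalised histograms.
rng = np.random.default_rng(0)
low = rng.random((6, 20))   # assumed low-level visual word histogram
low /= low.sum(axis=1, keepdims=True)
high = rng.random((6, 8))   # assumed high-level visual word histogram
high /= high.sum(axis=1, keepdims=True)

K_low = histogram_intersection_kernel(low, low)
K_high = histogram_intersection_kernel(high, high)

# Hypothetical per-class weights, fixed by hand here. In the paper they
# are learned separately for each object class.
class_weights = {"car": [0.7, 0.3], "face": [0.2, 0.8]}

K_per_class = {
    name: combine_kernels([K_low, K_high], w)
    for name, w in class_weights.items()
}
```

Each combined matrix in `K_per_class` could then be passed to a classifier that accepts a precomputed kernel, with one binary classifier trained per object class.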

Keywords

feature extraction; image representation; learning (artificial intelligence); object recognition; pattern classification; pattern clustering; quantisation (signal); Caltech-101 dataset; Pascal-VOC 06 dataset; adaptive multilevel feature integration; bag-of-word methods; baseline BoW method; correlational spectral clustering; geometric position distribution; high level visual feature extraction; local salient regions; multikernel learning framework; object category recognition; optimal feature weight coefficient; semantic coherent high level visual word quantization; simplified 2D string representation; visual representation; Computer science; Data mining; Digital images; Feature extraction; Image databases; Information technology; Object recognition; Testing; Visual databases; Windows;

fLanguage

English

Publisher

ieee

Conference_Titel

Web Conference (APWEB), 2010 12th International Asia-Pacific

Conference_Location

Busan

Print_ISBN

978-1-7695-4012-2

Electronic_ISBN

978-1-4244-6600-9

Type

conf

DOI

10.1109/APWeb.2010.24

Filename

5474128