Discriminatively Trained And-Or Tree Models for Object Detection

Author

Xi Song ; Tianfu Wu ; Yunde Jia ; Song-Chun Zhu

Author_Institution

Lab. of Intell. Inf. Technol., Beijing Inst. of Technol., Beijing, China

fYear

2013

fDate

23-28 June 2013

Firstpage

3278

Lastpage

3285

Abstract

This paper presents a method for learning reconfigurable And-Or Tree (AOT) models discriminatively from weakly annotated data for object detection. To explore the appearance and geometry space of latent structures effectively, we first quantize the image lattice using an overcomplete set of shape primitives, and then organize them into a directed acyclic And-Or Graph (AOG) by exploiting their compositional relations. We allow overlaps between child nodes when combining them into a parent node, which is equivalent to implicitly introducing an appearance Or-node for the overlapped portion. The learning of an AOT model consists of three components: (i) unsupervised sub-category learning (i.e., learning the branches of an object Or-node), with the latent structures in the AOG integrated out; (ii) weakly supervised part configuration learning (i.e., seeking the globally optimal parse trees in the AOG for each sub-category), for which we propose an efficient dynamic programming (DP) algorithm; and (iii) joint training of appearance and structural parameters under the latent structural SVM framework. In experiments, our method is tested on the PASCAL VOC 2007 and 2010 detection benchmarks of 20 object classes and outperforms comparable state-of-the-art methods.

Keywords

directed graphs; dynamic programming; learning (artificial intelligence); object detection; 2010 detection benchmarks; PASCAL VOC 2007; directed acyclic and-or graph; discriminatively trained and-or tree models; dynamic programming algorithm; geometry space; globally optimal parse trees; image lattice; object detection; object or-node; reconfigurable and-or tree models; shape primitives; structural SVM framework; structural parameters training; unsupervised sub-category learning; weakly annotated data; Deformable models; Lattices; Object detection; Shape; Space exploration; Support vector machines; Training; And-Or Graph; Latent Structural SVM; Object Detection; Part-based Representation

fLanguage

English

Publisher

IEEE

Conference_Titel

Computer Vision and Pattern Recognition (CVPR), 2013 IEEE Conference on

Conference_Location

Portland, OR

ISSN

1063-6919

Type

conf

DOI

10.1109/CVPR.2013.421

Filename

6619265