Title :
Learning Structured Appearance Models from Captioned Images of Cluttered Scenes
Author :
Jamieson, Michael ; Fazly, Afsaneh ; Dickinson, Sven ; Stevenson, Suzanne ; Wachsmuth, Sven
Author_Institution :
Univ. of Toronto, Toronto
Abstract :
Given an unstructured collection of captioned images of cluttered scenes featuring a variety of objects, our goal is to learn both the names and appearances of the objects. Only a small number of local features within any given image are associated with a particular caption word. We describe a connected graph appearance model where vertices represent local features and edges encode spatial relationships. We use the repetition of feature neighborhoods across training images and a measure of correspondence with caption words to guide the search for meaningful feature configurations. We demonstrate improved results on a dataset to which an unstructured object model was previously applied. We also apply the new method to a more challenging collection of captioned images from the Web, detecting and annotating objects within highly cluttered realistic scenes.
Keywords :
graph theory; learning (artificial intelligence); realistic images; captioned images; cluttered realistic scenes; connected graph appearance model; feature neighborhoods; learning structured appearance models; training images; Deformable models; Image databases; Image edge detection; Image storage; Layout; Object detection; Robustness; Shape; Space exploration; Visual databases;
Conference_Title :
Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on
Conference_Location :
Rio de Janeiro
Print_ISBN :
978-1-4244-1630-1
ISSN :
1550-5499
DOI :
10.1109/ICCV.2007.4408877