Learning better image representations using ‘flobject analysis’

Author

Li, Patrick S. ; Givoni, Inmar E. ; Frey, Brendan J.

Author_Institution

Univ. of Toronto, Toronto, ON, Canada

fYear

2011

fDate

20-25 June 2011

Firstpage

2721

Lastpage

2728

Abstract

Unsupervised learning can be used to extract image representations that are useful for various and diverse vision tasks. After noticing that most biological vision systems for interpreting static images are trained using disparity information, we developed an analogous framework for unsupervised learning. The output of our method is a model that can generate a vector representation or descriptor from any static image. However, the model is trained using pairs of consecutive video frames, which are used to find representations that are consistent with optical flow-derived objects, or ‘flobjects’. To demonstrate the flobject analysis framework, we extend the latent Dirichlet allocation bag-of-words model to account for real-valued word-specific flow vectors and image-specific probabilistic associations between flow clusters and topics. We show that the static image representations extracted using our method can be used to achieve higher classification rates and better generalization than standard topic models, spatial pyramid matching and gist descriptors.

Keywords

computer vision; feature extraction; image representation; image sequences; pattern clustering; unsupervised learning; biological vision systems; disparity information; flobject analysis; flow clusters; gist descriptors; image representation extraction; image-specific probabilistic associations; latent Dirichlet allocation bag-of-words model; optical flow-derived objects; real-valued word-specific flow vectors; spatial pyramid matching; standard topic models; static images; vector representation; video frames; Analytical models; Feature extraction; Histograms; Kernel; Optical imaging; Support vector machines; Training

fLanguage

English

Publisher

IEEE

Conference_Titel

Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on

Conference_Location

Providence, RI

ISSN

1063-6919

Print_ISBN

978-1-4577-0394-2

Type

conf

DOI

10.1109/CVPR.2011.5995649

Filename

5995649