DocumentCode
3018801
Title
Beyond bottom-up: Incorporating task-dependent influences into a computational model of spatial attention
Author
Peters, Robert J.; Itti, Laurent
Author_Institution
Univ. of Southern California, Los Angeles
fYear
2007
fDate
17-22 June 2007
Firstpage
1
Lastpage
8
Abstract
A critical function in both machine vision and biological vision systems is attentional selection of scene regions worthy of further analysis by higher-level processes such as object recognition. Here we present the first model of spatial attention that (1) can be applied to arbitrary static and dynamic image sequences with interactive tasks and (2) combines a general computational implementation of both bottom-up (BU) saliency and dynamic top-down (TD) task relevance; the claimed novelty lies in the combination of these elements and in the fully computational nature of the model. The BU component computes a saliency map from 12 low-level multi-scale visual features. The TD component computes a low-level signature of the entire image, and learns to associate different classes of signatures with the different gaze patterns recorded from human subjects performing a task of interest. We measured the ability of this model to predict the eye movements of people playing contemporary video games. We found that the TD model alone predicts where humans look about twice as well as does the BU model alone; in addition, a combined BU*TD model performs significantly better than either individual component. Qualitatively, the combined model predicts some easy-to-describe but hard-to-compute aspects of attentional selection, such as shifting attention leftward when approaching a left turn along a racing track. Thus, our study demonstrates the advantages of integrating BU factors derived from a saliency map and TD factors learned from image and task contexts in predicting where humans look while performing complex visually-guided behavior.
Keywords
computer vision; feature extraction; image sequences; object recognition; TD factors; biological vision systems; bottom-up saliency; dynamic image sequences; dynamic top-down; machine vision; multi-scale visual features; spatial attention; task contexts; task-dependent influences; Biological system modeling; Biology computing; Computational modeling; Games; Humans; Image sequences; Layout; Machine vision; Object recognition; Predictive models
fLanguage
English
Publisher
IEEE
Conference_Titel
Computer Vision and Pattern Recognition, 2007. CVPR '07. IEEE Conference on
Conference_Location
Minneapolis, MN
ISSN
1063-6919
Print_ISBN
1-4244-1179-3
Electronic_ISBN
1063-6919
Type
conf
DOI
10.1109/CVPR.2007.383337
Filename
4270335