Regional Information Center for Science and Technology - Figure-ground segmentation improves handled object recognition in egocentric video

DocumentCode :

3407196

Title :

Figure-ground segmentation improves handled object recognition in egocentric video

Author :

Ren, Xiaofeng ; Gu, Chunhui

Author_Institution :

Intel Labs Seattle, Seattle, WA, USA

fYear :

2010

fDate :

13-18 June 2010

Firstpage :

3137

Lastpage :

3144

Abstract :

Identifying handled objects, i.e. objects being manipulated by a user, is essential for recognizing the person's activities. An egocentric camera as worn on the body enjoys many advantages such as having a natural first-person view and not needing to instrument the environment. It is also a challenging setting, where background clutter is known to be a major source of problems and is difficult to handle with the camera constantly and arbitrarily moving. In this work we develop a bottom-up motion-based approach to robustly segment out foreground objects in egocentric video and show that it greatly improves object recognition accuracy. Our key insight is that egocentric video of object manipulation is a special domain and many domain-specific cues can readily help. We compute dense optical flow and fit it into multiple affine layers. We then use a max-margin classifier to combine motion with empirical knowledge of object location and background movement as well as temporal cues of support region and color appearance. We evaluate our segmentation algorithm on the large Intel Egocentric Object Recognition dataset with 42 objects and 100K frames. We show that, when combined with temporal integration, figure-ground segmentation improves the accuracy of a SIFT-based recognition system from 33% to 60%, and that of a latent-HOG system from 64% to 86%.
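
Illustrative note (not part of the original record): the abstract mentions fitting dense optical flow into affine layers. The sketch below shows only the basic building block such a step might rest on, namely a least-squares fit of one affine motion model to a flow field, plus a per-pixel residual that could be used to decide which pixels a layer explains. It is a minimal sketch under stated assumptions, not the authors' method: the function names `fit_affine_flow` and `affine_residual`, the synthetic flow, and the use of ordinary least squares are all hypothetical choices made for illustration.

```python
import numpy as np


def fit_affine_flow(xs, ys, us, vs):
    """Least-squares fit of an affine motion model to flow samples.

    Model (an assumption for this sketch):
        u = a0 + a1*x + a2*y
        v = b0 + b1*x + b2*y
    Returns the two coefficient vectors (a0, a1, a2) and (b0, b1, b2).
    """
    design = np.column_stack([np.ones_like(xs, dtype=float), xs, ys])
    coef_u, *_ = np.linalg.lstsq(design, us, rcond=None)
    coef_v, *_ = np.linalg.lstsq(design, vs, rcond=None)
    return coef_u, coef_v


def affine_residual(xs, ys, us, vs, coef_u, coef_v):
    """Per-sample distance between observed flow and the affine prediction."""
    design = np.column_stack([np.ones_like(xs, dtype=float), xs, ys])
    return np.hypot(design @ coef_u - us, design @ coef_v - vs)


# Synthetic flow field on a 20x30 pixel grid, generated from known
# (hypothetical) affine parameters so the fit can be checked.
ys_grid, xs_grid = np.mgrid[0:20, 0:30]
xs = xs_grid.ravel().astype(float)
ys = ys_grid.ravel().astype(float)
true_u = np.array([1.0, 0.02, -0.01])
true_v = np.array([-0.5, 0.0, 0.03])
us = true_u[0] + true_u[1] * xs + true_u[2] * ys
vs = true_v[0] + true_v[1] * xs + true_v[2] * ys

coef_u, coef_v = fit_affine_flow(xs, ys, us, vs)
residual = affine_residual(xs, ys, us, vs, coef_u, coef_v)
```

In a multi-layer setting, one plausible use of `residual` is to assign each pixel to the layer whose affine model gives the smallest residual and then refit each layer on its assigned pixels; whether the paper proceeds this way is not stated in the abstract.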

Keywords :

image recognition; image sequences; video signal processing; Intel Egocentric Object Recognition dataset; SIFT-based recognition system; background movement; bottom-up motion-based approach; dense optical flow; egocentric camera; egocentric video; figure-ground segmentation; multiple affine layers; object location; object manipulation; Cameras; Computer vision; Image motion analysis; Instruments; Mobile computing; Motion segmentation; Object recognition; Optical computing; Robustness; Wearable computers;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on

Conference_Location :

San Francisco, CA

ISSN :

1063-6919

Print_ISBN :

978-1-4244-6984-0

Type :

conf

DOI :

10.1109/CVPR.2010.5540074

Filename :

5540074

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3407196