Robust Video Object Cosegmentation

Author

Wenguan Wang ; Jianbing Shen ; Xuelong Li ; Fatih Porikli

Author_Institution

Beijing Lab. of Intell. Inf. Technol., Beijing Inst. of Technol., Beijing, China

Volume

24

Issue

10

fYear

2015

Firstpage

3137

Lastpage

3148

Abstract

With ever-increasing volumes of video data, automatic extraction of salient object regions has become even more significant for visual analytic solutions. This surge has also opened up opportunities for taking advantage of collective cues encapsulated in multiple videos in a cooperative manner. However, it also brings major challenges, such as handling drastic variations in the appearance, motion pattern, and pose of foreground objects, as well as indiscriminate backgrounds. Here, we present a cosegmentation framework to discover and segment out common object regions across multiple frames and multiple videos in a joint fashion. We incorporate three types of cues, i.e., intraframe saliency, interframe consistency, and across-video similarity, into an energy optimization framework that does not make restrictive assumptions on foreground appearance and motion model, and does not require objects to be visible in all frames. We also introduce a spatio-temporal scale-invariant feature transform (SIFT) flow descriptor to integrate across-video correspondence from the conventional SIFT flow into interframe motion flow from optical flow. This novel spatio-temporal SIFT flow generates reliable estimations of common foregrounds over the entire video data set. Experimental results show that our method outperforms the state of the art on a new extensive data set (ViCoSeg).
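
As a rough illustration of how three per-pixel cues might be combined in an energy formulation, the following is a minimal sketch and not the formulation used in the paper. The function name `cosegmentation_energy`, the weighted-average unary term, the Potts smoothness term, and all parameter values are assumptions made for this example; the cue maps are toy arrays rather than real saliency, optical-flow, or SIFT-flow outputs.

```python
import numpy as np

def cosegmentation_energy(labels, saliency, consistency, similarity,
                          weights=(1.0, 1.0, 1.0), smoothness=0.25):
    """Energy of a binary labeling (1 = foreground) for one frame.

    Hypothetical form: weighted-average unary cost plus a Potts penalty.
    All cue maps are assumed to hold values in [0, 1].
    """
    w_s, w_c, w_v = weights
    # Combine the three cues into one foreground score per pixel.
    fg_score = (w_s * saliency + w_c * consistency + w_v * similarity) / (w_s + w_c + w_v)
    # Unary term: foreground pixels pay (1 - score), background pixels pay score.
    unary = np.where(labels == 1, 1.0 - fg_score, fg_score).sum()
    # Pairwise term: count 4-connected neighbor pairs with different labels.
    pairwise = ((labels[:, 1:] != labels[:, :-1]).sum()
                + (labels[1:, :] != labels[:-1, :]).sum())
    return float(unary + smoothness * pairwise)

# Toy cue maps: a 2x2 object in the middle of a 4x4 frame.
mask = np.zeros((4, 4))
mask[1:3, 1:3] = 1.0
saliency, consistency, similarity = mask.copy(), mask.copy(), mask.copy()

e_good = cosegmentation_energy(mask.astype(int), saliency, consistency, similarity)
e_bg = cosegmentation_energy(np.zeros((4, 4), dtype=int), saliency, consistency, similarity)
e_fg = cosegmentation_energy(np.ones((4, 4), dtype=int), saliency, consistency, similarity)
print(e_good, e_bg, e_fg)
```

In this toy setup the labeling that matches the cue maps has lower energy than labeling every pixel background or every pixel foreground, which is the behavior an energy minimizer would exploit.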

Keywords

feature extraction; image motion analysis; image segmentation; optimisation; pose estimation; transforms; video signal processing; across-video similarity; automatic salient object region extraction; drastic appearance; energy optimization framework; indiscriminate backgrounds; interframe consistency; intraframe saliency; motion pattern; pose variations; robust video object cosegmentation; spatio-temporal SIFT flow descriptor; spatio-temporal scale-invariant feature transform flow descriptor; video data; visual analytic solutions; Integrated optics; Joints; Motion segmentation; Object segmentation; Optical imaging; Optimization; Video sequences; Video object co-segmentation; energy optimization; object refinement; spatio-temporal SIFT flow; spatio-temporal scale-invariant feature transform (SIFT) flow;

fLanguage

English

Journal_Title

Image Processing, IEEE Transactions on

Publisher

IEEE

ISSN

1057-7149

Type

jour

DOI

10.1109/TIP.2015.2438550

Filename

7113836