• DocumentCode
    35286
  • Title

    Pose Estimation and Segmentation of Multiple People in Stereoscopic Movies

  • Author

    Seguin, Guillaume ; Alahari, Karteek ; Sivic, Josef ; Laptev, Ivan

  • Author_Institution
    Dept. d'Inf., Ecole Normale Super., Paris, France
  • Volume
    37
  • Issue
    8
  • fYear
    2015
  • fDate
    Aug. 1 2015
  • Firstpage
    1643
  • Lastpage
    1655
  • Abstract
    We describe a method to obtain a pixel-wise segmentation and pose estimation of multiple people in stereoscopic videos. This task involves challenges such as dealing with unconstrained stereoscopic video, non-stationary cameras, and complex indoor and outdoor dynamic scenes with multiple people. We cast the problem as a discrete labelling task involving multiple person labels, devise a suitable cost function, and optimize it efficiently. The contributions of our work are two-fold: First, we develop a segmentation model incorporating person detections and learnt articulated pose segmentation masks, as well as colour, motion, and stereo disparity cues. The model also explicitly represents depth ordering and occlusion. Second, we introduce a stereoscopic dataset with frames extracted from feature-length movies “StreetDance 3D” and “Pina”. The dataset contains 587 annotated human poses, 1,158 bounding box annotations and 686 pixel-wise segmentations of people. The dataset is composed of indoor and outdoor scenes depicting multiple people with frequent occlusions. We demonstrate results on our new challenging dataset, as well as on the H2view dataset from (Sheasby et al. ACCV 2012).
  • Keywords
    cameras; entertainment; feature extraction; image segmentation; motion estimation; pose estimation; stereo image processing; video signal processing; H2view dataset; Pina; StreetDance 3D; annotated human poses; articulated pose segmentation mask; bounding box annotations; colour cue; complex indoor dynamic scenes; complex outdoor dynamic scenes; cost function; depth ordering; discrete labelling task; explicit analysis; feature-length movies; frame extraction; motion cue; multiple person labels; nonstationary cameras; occlusion; person detections; pixel-wise segmentation; pixel-wise segmentations; pose estimation; stereo disparity cue; stereoscopic movies; unconstrained stereoscopic video; Estimation; Feature extraction; Image color analysis; Motion pictures; Motion segmentation; Stereo image processing; Videos; 3D data; Person detection; Pose estimation; Segmentation; Stereo movies
  • fLanguage
    English
  • Journal_Title
    Pattern Analysis and Machine Intelligence, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0162-8828
  • Type

    jour

  • DOI
    10.1109/TPAMI.2014.2369050
  • Filename
    6951494