DocumentCode
3748769
Title
Predicting Multiple Structured Visual Interpretations
Author
Debadeepta Dey;Varun Ramakrishna;Martial Hebert;J. Andrew Bagnell
Author_Institution
Carnegie Mellon Univ., Pittsburgh, PA, USA
fYear
2015
Firstpage
2947
Lastpage
2955
Abstract
We present a simple approach for producing a small number of structured visual outputs that have high recall, for a variety of tasks including monocular pose estimation and semantic scene segmentation. Current state-of-the-art approaches learn a single model and modify inference procedures to produce a small number of diverse predictions. We take the alternative route of modifying the learning procedure to directly optimize for good, high-recall sequences of structured-output predictors. Our approach introduces no new parameters, naturally learns diverse predictions, and is not tied to any specific structured learning or inference procedure. We leverage recent advances in the contextual submodular maximization literature to learn a sequence of predictors, and empirically demonstrate the simplicity and performance of our approach on multiple challenging vision tasks, including achieving state-of-the-art results on multiple predictions for monocular pose estimation and image foreground/background segmentation.
Keywords
"Predictive models","Labeling","Adaptation models","Computer vision","Semantics","Inference algorithms"
Publisher
IEEE
Conference_Titel
Computer Vision (ICCV), 2015 IEEE International Conference on
Electronic_ISBN
2380-7504
Type
conf
DOI
10.1109/ICCV.2015.337
Filename
7410694