Regional Information Center for Science and Technology - Cutting Edge: Soft Correspondences in Multimodal Scene Parsing

DocumentCode :

3748574

Title :

Cutting Edge: Soft Correspondences in Multimodal Scene Parsing

Author :

Sarah Taghavi Namin;Mohammad Najafi;Mathieu Salzmann;Lars Petersson

Author_Institution :

Australian Nat. Univ., Canberra, ACT, Australia

Year :

2015

Firstpage :

1188

Lastpage :

1196

Abstract :

Exploiting multiple modalities for semantic scene parsing has been shown to improve accuracy over the single modality scenario. Existing methods, however, assume that corresponding regions in two modalities have the same label. In this paper, we address the problem of data misalignment and label inconsistencies, e.g., due to moving objects, in semantic labeling, which violate the assumption of existing techniques. To this end, we formulate multimodal semantic labeling as inference in a CRF, and introduce latent nodes to explicitly model inconsistencies between two domains. These latent nodes allow us not only to leverage information from both domains to improve their labeling, but also to cut the edges between inconsistent regions. To eliminate the need for hand tuning the parameters of our model, we propose to learn intra-domain and inter-domain potential functions from training data. We demonstrate the benefits of our approach on two publicly available datasets containing 2D imagery and 3D point clouds. Thanks to our latent nodes and our learning strategy, our method outperforms the state-of-the-art in both cases.
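
As a rough illustration of the idea in the abstract, the toy sketch below pairs one 2D region with one 3D region and adds a binary latent variable that either keeps or cuts the edge between them. It is not the authors' formulation: the function names, the label set, the cost values, and the brute-force search are all assumptions chosen for readability, and the paper learns its potentials from training data rather than fixing them by hand.

```python
from itertools import product

# Hypothetical two-class label set, for illustration only.
LABELS = ("road", "car")


def energy(y2d, y3d, z, unary_2d, unary_3d, disagree_cost=2.0, cut_cost=1.0):
    """Energy of one joint assignment for a single 2D/3D region pair.

    z == 1 keeps the cross-domain edge, so differing labels are penalized.
    z == 0 cuts the edge: the labels decouple, at a fixed price cut_cost.
    All cost values are made-up constants, not learned potentials.
    """
    e = unary_2d[y2d] + unary_3d[y3d]
    if z == 1:
        e += disagree_cost if y2d != y3d else 0.0
    else:
        e += cut_cost
    return e


def map_inference(unary_2d, unary_3d, **costs):
    """Exhaustive minimum-energy search over (y2d, y3d, z)."""
    states = product(LABELS, LABELS, (0, 1))
    return min(states, key=lambda s: energy(*s, unary_2d, unary_3d, **costs))


# Mild disagreement: the edge is kept and both domains share one label.
weak = map_inference({"road": 0.0, "car": 0.5}, {"road": 0.4, "car": 0.0})

# Strong disagreement (e.g. an object that moved between captures):
# cutting the edge is cheaper than forcing a common label.
strong = map_inference({"road": 0.0, "car": 3.0}, {"road": 3.0, "car": 0.0})
```

With these assumed costs, the first call keeps the edge and labels both regions `road`, while the second cuts the edge and lets each domain keep its own preferred label. A full model would have many such pairs plus intra-domain edges, where exhaustive search is no longer feasible.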

Keywords :

"Three-dimensional displays","Labeling","Semantics","Feature extraction","Image analysis","Sensors","Laser radar"

Publisher :

IEEE

Conference_Title :

Computer Vision (ICCV), 2015 IEEE International Conference on

Electronic_ISBN :

2380-7504

Type :

conf

DOI :

10.1109/ICCV.2015.141

Filename :

7410498

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3748574