Title :
Scene understanding with discriminative structured prediction
Author :
Yuan, Jinhui ; Li, Jianmin ; Zhang, Bo
Author_Institution :
Dept. of Comput. Sci. & Technol., Tsinghua Univ., Beijing
Abstract :
Spatial priors play a crucial role in many high-level vision tasks, e.g. scene understanding. Learning spatial priors usually relies on training a structured output model. In this paper, two special cases of discriminative structured output models, namely conditional random fields (CRFs) and max-margin Markov networks (M3N), are applied to image scene understanding. The two models are compared empirically on equal terms, i.e. using a common feature representation and the same optimization algorithm. In particular, we adopt the online exponentiated gradient (EG) algorithm to solve the convex duals of both models. We describe the general procedure of the EG algorithm and present a two-stage training procedure to overcome the degeneration of EG when exact inference is intractable. Experiments are carried out on a large-scale image region annotation task. The results show that both models yield encouraging results, with CRFs slightly outperforming M3N.
Keywords :
Markov processes; computer vision; conditional random fields; discriminative structured prediction; exponentiated gradient algorithm; high-level vision tasks; image region annotation task; max-margin Markov networks; scene understanding; Computer science; Computer vision; Inference algorithms; Information science; Intelligent structures; Intelligent systems; Laboratories; Large-scale systems; Layout; Markov random fields;
Conference_Title :
Computer Vision and Pattern Recognition, 2008. CVPR 2008. IEEE Conference on
Conference_Location :
Anchorage, AK
Print_ISBN :
978-1-4244-2242-5
Electronic_ISBN :
1063-6919
DOI :
10.1109/CVPR.2008.4587602