Weakly Supervised Photo Cropping

Author

Luming Zhang ; Mingli Song ; Yi Yang ; Qi Zhao ; Chen Zhao ; Nicu Sebe

Author_Institution

Coll. of Comput. Sci., Zhejiang Univ., Hangzhou, China

Volume

16

Issue

1

fYear

2014

fDate

Jan. 2014

Firstpage

94

Lastpage

107

Abstract

Photo cropping is widely used in the printing industry, photography, and cinematography. Conventional photo cropping methods suffer from three drawbacks: 1) the semantics used to describe photo aesthetics are determined by the experience of model designers and specific data sets, 2) image global configurations, an essential cue for capturing photo aesthetics, are not well preserved in the cropped photo, and 3) multi-channel visual features from an image region contribute differently to human aesthetics, but state-of-the-art photo cropping methods cannot automatically weight them. Owing to recent progress in the image retrieval community, image-level semantics, i.e., photo labels obtained without much human supervision, can be efficiently and effectively acquired. Thus, we propose weakly supervised photo cropping, where a manifold embedding algorithm is developed to incorporate image-level semantics and image global configurations with graphlets, i.e., small-sized connected subgraphs. After manifold embedding, a Bayesian Network (BN) is proposed. It incorporates the testing photo into the framework derived from the multi-channel post-embedding graphlets of the training data, the importance of which is determined automatically. Based on the BN, photo cropping can be cast as searching for the candidate cropped photo that maximally preserves graphlets from the training photos, and the optimal cropping parameter is inferred by Gibbs sampling. Subjective evaluations demonstrate that: 1) our approach outperforms several representative photo cropping methods, including our previous cropping model that is guided by semantics-free graphlets, and 2) the visualized graphlets explicitly capture photo semantics and global spatial configurations.
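
The abstract defines a graphlet as a small-sized connected subgraph but this record does not say which graph the graphlets are drawn from. As a minimal illustration only, the sketch below assumes a hypothetical adjacency graph over image regions (`region_adjacency`) and enumerates every connected node subset up to a size limit by brute force. The function names and the toy graph are assumptions made for this example and are not taken from the paper.

```python
from itertools import combinations


def is_connected(nodes, adjacency):
    """Return True if `nodes` induce a connected subgraph of `adjacency`."""
    nodes = set(nodes)
    start = next(iter(nodes))
    seen = {start}
    stack = [start]
    # Depth-first search restricted to the chosen node subset.
    while stack:
        current = stack.pop()
        for neighbor in adjacency[current]:
            if neighbor in nodes and neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return seen == nodes


def enumerate_graphlets(adjacency, max_size):
    """List every connected node subset with 1..max_size nodes."""
    graphlets = []
    for size in range(1, max_size + 1):
        for subset in combinations(sorted(adjacency), size):
            if is_connected(subset, adjacency):
                graphlets.append(subset)
    return graphlets


# Hypothetical example: four regions adjacent in a chain 0-1-2-3.
region_adjacency = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
graphlets = enumerate_graphlets(region_adjacency, max_size=3)
```

The brute-force enumeration grows combinatorially with the number of regions, so it is only practical for small graphs and small size limits.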

Keywords

feature extraction; image retrieval; BN; Bayesian network; Gibbs sampling; cinematography; global spatial configuration; human aesthetics; image global configurations; image region; image retrieval community; image-level semantics; manifold embedding algorithm; model designers; multichannel post-embedding graphlets; multichannel visual features; optimal cropping parameter; photo aesthetics; photo cropping method; photo labels; photography; printing industry; semantics-free graphlets; small-sized connected subgraph; specific data sets; supervised photo cropping; training data; training photos; visualized graphlets; Educational institutions; Layout; Manifolds; Semantics; Vectors; Visualization; photo cropping; weakly supervised

fLanguage

English

Journal_Title

Multimedia, IEEE Transactions on

Publisher

IEEE

ISSN

1520-9210

Type

jour

DOI

10.1109/TMM.2013.2286817

Filename

6644258