Automatic Training Image Acquisition and Effective Feature Selection From Community-Contributed Photos for Facial Attribute Detection

Author

Yan-Ying Chen ; Hsu, W.H. ; Liao, Hong-Yuan Mark

Author_Institution

Dept. of Comput. Sci. & Inf. Eng., Nat. Taiwan Univ., Taipei, Taiwan

Volume

15

Issue

6

fYear

2013

fDate

Oct. 2013

Firstpage

1388

Lastpage

1399

Abstract

Facial attributes are shown effective for mining specific persons and profiling human activities in large-scale media such as surveillance videos or photo-sharing services. For comprehensive analysis, a rich number of facial attributes is required. Generally, each attribute detector is obtained by supervised learning via the use of large training data. It is promising to leverage the exponentially growing community contributed photos and the associated informative contexts to ease the burden of manual annotation; however, such huge noisy data from the Internet still pose great challenges. We propose to measure the quality of training images by discriminable visual features, which are verified with the relative discrimination between the unlabeled images and the pseudo-positives (pseudo-negatives) retrieved by textual relevance. The proposed feature selection requires no heuristic threshold, therefore, can be generalized to multiple feature modalities. We further exploit the rich context cues (e.g., tags, geo-locations, etc.) associated with the publicly available photos for mining more semantically consistent but visually diverse training images around the world. Experimenting in the benchmarks, we demonstrate that our work can successfully acquire effective training images for learning generic facial attributes, where the classification error is relatively reduced up to 23.35% compared with that of the text-based approach and shown comparable with that of costly manual annotations. (All of the face images presented in this paper except the training images in Fig. 8 attribute to various Flickr users under Creative Commons Licenses).

Keywords

face recognition; learning (artificial intelligence); video surveillance; automatic training image acquisition; community-contributed photos; discriminable visual features; diverse training images; facial attribute detection; feature selection; large-scale media; supervised learning; text-based approach; textual relevance; unlabeled images; Context; Detectors; Feature extraction; Manuals; Noise measurement; Training; Visualization; Facial attribute detection; feature selection; training image acquisition;

fLanguage

English

Journal_Title

Multimedia, IEEE Transactions on

Publisher

ieee

ISSN

1520-9210

Type

jour

DOI

10.1109/TMM.2013.2250492

Filename

6472796