DocumentCode :
2460854
Title :
Leveraging archival video for building face datasets
Author :
Ramanan, Deva ; Baker, Simon ; Kakade, Sham
Author_Institution :
Toyota Technol. Inst. at Chicago, Chicago
fYear :
2007
fDate :
14-21 Oct. 2007
Firstpage :
1
Lastpage :
8
Abstract :
We introduce a semi-supervised method for building large, labeled datasets of faces by leveraging archival video. Specifically, we have implemented a system for labeling 11 years' worth of archival footage from a television show. We have compiled a dataset of 611,770 faces, orders of magnitude larger than existing collections. It includes variation in appearance due to age, weight gain, changes in hairstyles, and other factors difficult to observe in smaller-scale collections. Face recognition in an uncontrolled setting can be difficult. We argue (and demonstrate) that there is much structure at varying timescales in the video data that makes recognition much easier. At local timescales, one can use motion and tracking to group face images together - we may not know the identity, but we know a single label applies to all faces in a track. At medium timescales (say, within a scene), one can use appearance features such as hair and clothing to group tracks across shot boundaries. However, at longer timescales (say, across episodes), one can no longer use clothing as a cue. This suggests that one needs to carefully encode representations of appearance, depending on the timescale at which one intends to match. We assemble our final dataset by classifying groups of tracks in a nearest-neighbors framework. We use a face library obtained by labeling track clusters in a reference episode. We show that this classification is significantly easier when exploiting the hierarchical structure naturally present in the video sequences. From a data-collection point of view, tracking is vital because it adds non-frontal poses to our face collection. This is important because we know of no other method for collecting images of non-frontal faces "in the wild".
Keywords :
face recognition; image classification; archival video; building face datasets; face recognition; nearest-neighbors framework; semisupervised method; Assembly; Clothing; Face recognition; Hair; Labeling; Layout; Libraries; TV; Tracking; Video sequences;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on
Conference_Location :
Rio de Janeiro
ISSN :
1550-5499
Print_ISBN :
978-1-4244-1630-1
Electronic_ISBN :
1550-5499
Type :
conf
DOI :
10.1109/ICCV.2007.4409012
Filename :
4409012