DocumentCode :
3518706
Title :
Audiovisual celebrity recognition in unconstrained web videos
Author :
Sargin, Mehmet Emre ; Aradhye, Hrishikesh ; Moreno, Pedro J. ; Zhao, Ming
Author_Institution :
Google Inc., Mountain View, CA
fYear :
2009
fDate :
19-24 April 2009
Firstpage :
1977
Lastpage :
1980
Abstract :
The number of video clips available online is growing at a tremendous pace. Conventionally, user-supplied metadata text, such as the title of the video and a set of keywords, has been the only source of indexing information for user-uploaded videos. Automated extraction of video content for unconstrained, large-scale video databases is a challenging and as yet unsolved problem. In this paper, we present an audiovisual celebrity recognition system for automatic tagging of unconstrained Web videos. Prior work on audiovisual person recognition relied on the assumption that the person in the video is speaking and that the features extracted from the audio and visual domains are associated with each other throughout the video. However, this assumption is not valid for unconstrained Web videos. The proposed method finds the audiovisual mapping and hence improves upon the association assumption. Considering the scale of the application, all pieces of the system are trained automatically without any human supervision. We present results on 26,000 videos and show the effectiveness of the method on a per-celebrity basis.
Keywords :
Internet; face recognition; feature extraction; speaker recognition; video signal processing; audiovisual celebrity recognition; audiovisual mapping; audiovisual person recognition; automated video content extraction; automatic tagging; metadata text; per-celebrity basis; unconstrained Web video; user-uploaded video; video clip; Automatic speech recognition; Biometrics; Data mining; Face detection; Indexing; Videos; Working environment noise
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on
Conference_Location :
Taipei
ISSN :
1520-6149
Print_ISBN :
978-1-4244-2353-8
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2009.4959999
Filename :
4959999