DocumentCode :
3672610
Title :
TVSum: Summarizing web videos using titles
Author :
Yale Song;Jordi Vallmitjana;Amanda Stent;Alejandro Jaimes
Author_Institution :
Yahoo Labs, New York, USA
fYear :
2015
fDate :
June 2015
Firstpage :
5179
Lastpage :
5187
Abstract :
Video summarization is a challenging problem in part because knowing which part of a video is important requires prior knowledge about its main topic. We present TVSum, an unsupervised video summarization framework that uses title-based image search results to find visually important shots. We observe that a video title is often carefully chosen to be maximally descriptive of its main topic, and hence images related to the title can serve as a proxy for important visual concepts of the main topic. However, because titles are free-form, unconstrained, and often ambiguously written, images searched using the title can contain noise (images irrelevant to video content) and variance (images of different topics). To deal with this challenge, we developed a novel co-archetypal analysis technique that learns canonical visual concepts shared between video and images, but not in either alone, by finding a joint-factorial representation of two data sets. We introduce a new benchmark dataset, TVSum50, that contains 50 videos and their shot-level importance scores annotated via crowdsourcing. Experimental results on two datasets, SumMe and TVSum50, suggest our approach produces superior-quality summaries compared to several recently proposed approaches.
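The abstract only names co-archetypal analysis without spelling it out, so the sketch below illustrates just the general shared-archetype idea under stated assumptions, not the paper's actual formulation or code. All names (co_archetypes, simplex_cols, the synthetic features, the crude column renormalization used in place of a true simplex projection, and the reconstruction-residual importance proxy) are hypothetical choices for this example.

```python
import numpy as np

def simplex_cols(M):
    """Crudely keep each column of M on the probability simplex: clip to a small
    positive floor and renormalize columns to sum to one (a simple stand-in for
    a true Euclidean simplex projection)."""
    M = np.maximum(M, 1e-12)
    return M / M.sum(axis=0, keepdims=True)

def co_archetypes(video_feats, image_feats, n_archetypes=8, n_iters=200, lr=1e-2, seed=0):
    """Learn archetypes shared by video-shot features and title-search image features.

    video_feats: (n_v, d) per-shot descriptors; image_feats: (n_i, d) image descriptors.
    Returns (Z, A_v, A_i): d x k archetypes and the convex weights reconstructing each set.
    """
    rng = np.random.default_rng(seed)
    X_v, X_i = video_feats.T, image_feats.T            # d x n_v, d x n_i
    pool = np.hstack([X_v, X_i])                       # d x (n_v + n_i) candidate pool
    k, n_pool = n_archetypes, pool.shape[1]

    B = simplex_cols(rng.random((n_pool, k)))          # archetypes Z = pool @ B
    A_v = simplex_cols(rng.random((k, X_v.shape[1])))
    A_i = simplex_cols(rng.random((k, X_i.shape[1])))

    for _ in range(n_iters):
        Z = pool @ B                                   # d x k shared archetypes
        # Projected gradient steps on ||X_v - Z A_v||^2 + ||X_i - Z A_i||^2.
        A_v = simplex_cols(A_v - lr * Z.T @ (Z @ A_v - X_v))
        A_i = simplex_cols(A_i - lr * Z.T @ (Z @ A_i - X_i))
        R = (Z @ A_v - X_v) @ A_v.T + (Z @ A_i - X_i) @ A_i.T   # d x k residual term
        B = simplex_cols(B - lr * (pool.T @ R) / n_pool)         # step sizes are illustrative
    return pool @ B, A_v, A_i

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    video = rng.normal(size=(200, 64))    # 200 shots, 64-d features (synthetic stand-in)
    images = rng.normal(size=(300, 64))   # 300 title-search images (synthetic stand-in)
    Z, A_v, _ = co_archetypes(video, images)
    # One plausible importance proxy: shots that the shared archetypes reconstruct
    # well are more "canonical" for the title's topic.
    residual = np.linalg.norm(video.T - Z @ A_v, axis=0)
    importance = 1.0 / (1.0 + residual)
    print(importance[:5])
```

Because the archetype pool concatenates video-shot and search-image features, the learned archetypes must explain both data sets jointly, which is the intuition behind using title-search images to denoise what counts as an important visual concept.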
Keywords :
"Videos","Yttrium","Visualization","Optimization","Approximation methods","Crowdsourcing","Focusing"
Publisher :
ieee
Conference_Titel :
Computer Vision and Pattern Recognition (CVPR), 2015 IEEE Conference on
Electronic_ISSN :
1063-6919
Type :
conf
DOI :
10.1109/CVPR.2015.7299154
Filename :
7299154