Multi-task deep visual-semantic embedding for video thumbnail selection

Author

Wu Liu;Tao Mei;Yongdong Zhang;Cherry Che;Jiebo Luo

Author_Institution

Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China

fYear

2015

fDate

6/1/2015 12:00:00 AM

Firstpage

3707

Lastpage

3715

Abstract

Given the tremendous growth of online videos, the video thumbnail, as the common visualization form of video content, is becoming increasingly important in shaping users' browsing and searching experience. However, conventional methods for video thumbnail selection often fail to produce satisfying results because they ignore the side semantic information (e.g., title, description, and query) associated with the video. As a result, the selected thumbnail cannot always represent the video's semantics, and the click-through rate is adversely affected even when the retrieved videos are relevant. In this paper, we develop a multi-task deep visual-semantic embedding model that automatically selects query-dependent video thumbnails according to both visual and side information. Unlike most existing methods, the proposed approach employs the deep visual-semantic embedding model to directly compute the similarity between the query and video thumbnails by mapping them into a common latent semantic space, where even unseen query-thumbnail pairs can be correctly matched. In particular, we train the embedding model on large-scale, freely accessible click-through video and image data, and employ a multi-task learning strategy to holistically exploit the query-thumbnail relevance from these two highly related datasets. Finally, a thumbnail is selected by fusing both the representative and query relevance scores. Evaluations on a 1,000 query-thumbnail dataset labeled by 191 workers on Amazon Mechanical Turk demonstrate the effectiveness of the proposed method.
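
The final selection step described in the abstract (scoring candidate thumbnails against the query in a shared embedding space, then fusing that relevance with a representativeness score) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the similarity measure or the fusion rule, so the cosine similarity, the linear weight `alpha`, and the function names `cosine` and `select_thumbnail` are all assumptions made for illustration. The embedding vectors are taken as given rather than produced by a trained deep model.

```python
import math
from typing import Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two vectors; returns 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def select_thumbnail(
    query_vec: Sequence[float],
    frame_vecs: Sequence[Sequence[float]],
    representativeness: Sequence[float],
    alpha: float = 0.5,
) -> int:
    """Return the index of the candidate frame with the highest fused score.

    Assumed fusion rule (hypothetical, not taken from the paper):
        score = alpha * query_relevance + (1 - alpha) * representativeness
    where query_relevance is the cosine similarity between the query
    embedding and the frame embedding in the shared space.
    """
    if not frame_vecs or len(frame_vecs) != len(representativeness):
        raise ValueError("need one representativeness score per candidate frame")

    best_index = -1
    best_score = -math.inf
    for i, (vec, rep) in enumerate(zip(frame_vecs, representativeness)):
        score = alpha * cosine(query_vec, vec) + (1.0 - alpha) * rep
        if score > best_score:
            best_index, best_score = i, score
    return best_index
```

With `alpha = 0.0` the choice depends only on representativeness and is the same for every query; raising `alpha` makes the selected thumbnail increasingly query-dependent.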

Keywords

"Semantics","Visualization","Training","Computational modeling","Mathematical model","Data models","Nails"

Publisher

IEEE

Conference_Titel

Computer Vision and Pattern Recognition (CVPR), 2015 IEEE Conference on

Electronic_ISBN

1063-6919

Type

conf

DOI

10.1109/CVPR.2015.7298994

Filename

7298994