Exploiting Web Images for Semantic Video Indexing Via Robust Sample-Specific Loss

Author

Yang Yang ; Zheng-Jun Zha ; Yue Gao ; Xiaofeng Zhu ; Tat-Seng Chua

Author_Institution

Sch. of Comput. Sci. & Eng., Univ. of Electron. Sci. & Technol. of China, Chengdu, China

Volume

16

Issue

6

fYear

2014

fDate

Oct. 2014

Firstpage

1677

Lastpage

1689

Abstract

Semantic video indexing, also known in the literature as video annotation or video concept detection, has attracted significant attention in recent years. Because labeled training videos are scarce, most existing approaches struggle to achieve satisfactory performance. In this paper, we propose a novel semantic video indexing approach that exploits abundant user-tagged Web images to help learn robust semantic video indexing classifiers. We address two major challenges: 1) noisy Web images with imprecise and/or incomplete tags; and 2) the domain difference between images and videos. Specifically, we first apply a non-parametric approach to estimate the probability that each image is correctly tagged, and use this probability as the image's confidence score. We then develop a robust transfer video indexing (RTVI) model to learn reliable classifiers from a limited number of training videos together with a large number of user-tagged images. The RTVI model is equipped with a novel sample-specific robust loss function, which uses the confidence score of a Web image as prior knowledge to suppress the influence of that image and control its contribution in the learning process. Meanwhile, the RTVI model discovers an optimal kernel space in which the mismatch between images and videos is minimized, thereby tackling the domain difference problem. In addition, we devise an iterative algorithm to optimize the proposed RTVI model and provide a theoretical analysis of its convergence. Extensive experiments on various real-world multimedia collections demonstrate the effectiveness of the proposed approach.

Keywords

Internet; image classification; indexing; iterative methods; learning (artificial intelligence); statistical analysis; video signal processing; RTVI model; classifier learning; confidence score; domain difference; iterative algorithm; learning process; nonparametric approach; probability estimation; robust transfer video indexing; sample-specific robust loss function; semantic video indexing; user-tagged Web images; video annotation; video concept detection; Kernel; Noise; Noise measurement; Robustness; Semantics; Robust; transfer learning

fLanguage

English

Journal_Title

Multimedia, IEEE Transactions on

Publisher

ieee

ISSN

1520-9210

Type

jour

DOI

10.1109/TMM.2014.2323014

Filename

6813690