DocumentCode :
1238939
Title :
A unified framework for semantic shot classification in sports video
Author :
Duan, Ling-Yu ; Xu, Min ; Tian, Qi ; Xu, Chang-Sheng ; Jin, Jesse S.
Author_Institution :
Inst. for Infocomm Res., Singapore
Volume :
7
Issue :
6
fYear :
2005
Firstpage :
1066
Lastpage :
1083
Abstract :
The extensive amount of multimedia information available necessitates content-based video indexing and retrieval methods. Since humans tend to use high-level semantic concepts when querying and browsing multimedia databases, there is an increasing need for semantic video indexing and analysis. For this purpose, we present a unified framework for semantic shot classification in sports video, a domain that has been widely studied because of its tremendous commercial potential. Unlike most existing approaches, which focus on clustering by aggregating shots or key-frames with similar low-level features, the proposed scheme employs supervised learning to perform a top-down video shot classification. Moreover, the supervised learning procedure is built on effective mid-level representations instead of exhaustive low-level features. The framework consists of three main steps: 1) identify video shot classes for each sport; 2) develop a common set of motion-, color-, and shot-length-related mid-level representations; and 3) apply supervised learning to the given sports video shots. It is observed that for each sport we can predefine a small number of semantic shot classes, about 5-10, which cover 90%-95% of broadcast sports video. We employ nonparametric feature space analysis to map low-level features to mid-level semantic video shot attributes such as dominant object (a player) motion, camera motion patterns, and court shape. Based on the fusion of those mid-level shot attributes, we classify video shots into the predefined shot classes, each of which has a clear semantic meaning. With this framework we have achieved good classification accuracy of 85%-95% on the game videos of five typical ball-type sports (i.e., tennis, basketball, volleyball, soccer, and table tennis), comprising over 5500 shots and about 8 h of footage. With correctly classified sports video shots, further structural and temporal analysis, such as event detection, highlight extraction, video skimming, and table-of-contents generation, will be greatly facilitated.
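The third step of the framework (supervised classification of shots from fused mid-level attributes) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not name the classifier, so a nearest-centroid rule stands in for it, and the attribute names, shot class labels, and numeric values are hypothetical toy data rather than anything taken from the paper.

```python
from collections import defaultdict
from math import dist

# Hypothetical mid-level attribute vector per shot (illustrative only):
# (camera_motion, player_motion, court_area_ratio, shot_length_s / 10)

def fit_centroids(samples):
    """Average the attribute vectors of each labelled shot class.

    samples: list of (attribute_vector, shot_class) pairs.
    Returns a dict mapping shot_class -> mean attribute vector.
    """
    grouped = defaultdict(list)
    for vec, label in samples:
        grouped[label].append(vec)
    return {
        label: tuple(sum(col) / len(vecs) for col in zip(*vecs))
        for label, vecs in grouped.items()
    }

def classify(centroids, vec):
    """Assign a shot to the class with the nearest centroid (Euclidean)."""
    return min(centroids, key=lambda label: dist(centroids[label], vec))

# Made-up training shots with hypothetical class names.
TRAIN = [
    ((0.1, 0.8, 0.7, 1.2), "court_view"),
    ((0.2, 0.7, 0.6, 1.0), "court_view"),
    ((0.6, 0.3, 0.1, 0.4), "player_closeup"),
    ((0.7, 0.2, 0.0, 0.3), "player_closeup"),
]

centroids = fit_centroids(TRAIN)
predicted = classify(centroids, (0.15, 0.75, 0.65, 1.1))
print(predicted)
```

The point of the sketch is only the shape of the pipeline: each shot is reduced to a short vector of mid-level attributes, and a supervised model trained on labelled shots maps that vector to one of a small set of predefined classes. Any standard supervised classifier could replace the nearest-centroid rule.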
Keywords :
content-based retrieval; database indexing; learning (artificial intelligence); semantic networks; sport; video databases; video retrieval; content-based video retrieval; nonparametric feature space analysis; semantic gap; semantic shot classification; semantic video indexing; shot representation; sports video; supervised learning; video classification; video databases indexing; Broadcasting; Humans; Indexing; Information retrieval; Motion analysis; Multimedia communication; Multimedia databases; Pattern analysis; shot similarity
fLanguage :
English
Journal_Title :
Multimedia, IEEE Transactions on
Publisher :
IEEE
ISSN :
1520-9210
Type :
jour
DOI :
10.1109/TMM.2005.858395
Filename :
1542084