Title :
Exploiting Visual-Audio-Textual Characteristics for Automatic TV Commercial Block Detection and Segmentation
Author :
Nan Liu ; Yao Zhao ; Zhenfeng Zhu ; Hanqing Lu
Author_Institution :
Sch. of Comput. & Inf. Technol., Beijing Jiaotong Univ., Beijing, China
Abstract :
Automatic TV commercial block detection (CBD) and commercial block segmentation (CBS) are two key components of a smart commercial digesting system. In this paper, we address CBD and CBS through the joint exploitation of visual, audio, and textual characteristics embedded in commercials. Rather than relying exclusively on visual and audio characteristics, as most previous works do, we also fully exploit the abundant textual characteristics associated with commercials. Additionally, Tri-AdaBoost, an interactive ensemble learning method, is proposed to form a consolidated semantic fusion across visual, audio, and textual characteristics. To segment a detected commercial block into individual commercials, additional informative descriptors, including textual characteristics, are introduced to make the detection of frames marked with product information (FMPI) more robust. Together with the audio spectral variation pointer and silent position characteristics, FMPI provides a complementary representation for modeling intra-commercial similarity and inter-commercial dissimilarity. Experiments are conducted on a large video dataset drawn from both China Central Television (CCTV) channels and TRECVID'05, and the promising results show the effectiveness of the proposed scheme.
Keywords :
image segmentation; learning (artificial intelligence); media streaming; multimedia computing; object detection; CBD; CBS; China central television channels; FMPI; TRECVID'05; Tri-AdaBoost; audio spectral variation pointer; automatic TV commercial block detection; automatic TV commercial block segmentation; frame marked with product information detection; interactive ensemble learning; silent position; smart commercial digesting system; visual-audio-textual characteristics; Feature extraction; Indexing; Information science; Robustness; Semantics; TV; Visualization; Commercial detection; commercial segmentation; multi-modal fusion; text detection; video analysis;
Journal_Title :
Multimedia, IEEE Transactions on
DOI :
10.1109/TMM.2011.2160334