• DocumentCode
    2299526
  • Title

    Multi-modal characteristics analysis and fusion for TV commercial detection

  • Author

    Liu, Nan ; Zhao, Yao ; Zhu, Zhenfeng ; Lu, Hanqing

  • Author_Institution
    Inst. of Inf. Sci., Beijing Jiaotong Univ., Beijing, China
  • fYear
    2010
  • fDate
    19-23 July 2010
  • Firstpage
    831
  • Lastpage
    836
  • Abstract
    Automatic TV commercial detection has become an indispensable part of content-based video analysis due to the explosive growth in TV commercial volume. In this paper, a multi-modal (i.e., visual, audio, and textual) commercial digesting scheme is proposed to address two challenges in commercial detection: the generation of mid-level semantic descriptors and the application of an effective discrimination method. Compared with general programs, commercials are purposely embedded with unique semantic characteristics to attract more attention from the audience. To exploit these semantic characteristics, a novel commercial-oriented descriptor from the textual modality is proposed, in addition to commonly used descriptors from the audio and visual modalities. To improve the discrimination of commercials from general programs in the multi-modal representation space, Tri-AdaBoost, a self-learning method that works interactively across multiple modalities, is introduced to form a final consolidated decision. Moreover, a heuristic post-processing strategy based on temporal consistency is applied to further reduce false alarms. Experimental results on large video data collections show the effectiveness of the proposed scheme.
  • Keywords
    image fusion; learning (artificial intelligence); object detection; video signal processing; TV commercial detection; Tri-AdaBoost self-learning method; audio modality; content-based video analysis; discrimination method; mid-level semantic descriptor; multimodal characteristic analysis; multimodal characteristic fusion; multimodal commercial digesting scheme; textual modality; visual modality; Accuracy; Error analysis; Feature extraction; Semantics; TV; Training; Visualization; Commercial Detection; Mid-Level Descriptor; Multimedia Analysis; Tri-AdaBoost; Video Categorization;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Multimedia and Expo (ICME), 2010 IEEE International Conference on
  • Conference_Location
    Suntec City
  • ISSN
    1945-7871
  • Print_ISBN
    978-1-4244-7491-2
  • Type

    conf

  • DOI
    10.1109/ICME.2010.5583867
  • Filename
    5583867