• DocumentCode
    41785
  • Title
    Efficient Heuristic Methods for Multimodal Fusion and Concept Fusion in Video Concept Detection
  • Author
    Jie Geng ; Zhenjiang Miao ; Xiao-Ping Zhang
  • Author_Institution
    Inst. of Inf. Sci., Beijing Jiaotong Univ., Beijing, China
  • Volume
    17
  • Issue
    4
  • fYear
    2015
  • fDate
    April 2015
  • Firstpage
    498
  • Lastpage
    511
  • Abstract
    Semantic models are widely used to bridge the semantic gap between low-level features and high-level features in video concept indexing. Multimodal fusion and concept fusion are two commonly used approaches to building semantic models. In previous work, domain adaptation is neglected in multimodal fusion, and many probability-maximization-based and unsupervised concept fusion methods are counterintuitive because they do not incorporate subjective human intuition. In this paper, we present a new two-stage semantic model that combines multimodal fusion and concept fusion and incorporates human heuristics. In the multimodal fusion model, we employ a new generic unsupervised method, namely domain adaptive linear combination (DALC), to update the linear combination (LC) weights by incorporating the differences in element distributions between the training and testing domains. In the concept fusion model, a novel mechanical node equilibrium (NE) model is developed, which uses forces to model the concept correlations and thereby update the scores of the concepts represented by nodes. It is intuitive and can incorporate multiple kinds of correlations simultaneously to construct a more sophisticated semantic structure. Compared to other state-of-the-art supervised and unsupervised methods, the new model can use either unsupervised or supervised factors to significantly improve the mean inferred average precision (MAP) performance on all datasets.
  • Keywords
    image fusion; indexing; optimisation; probability; unsupervised learning; video signal processing; DALC; LC weights; MAP performance; NE model; concept fusion model; domain adaptive linear combination; heuristic methods; high-level features; linear combination weights; low-level features; mean inferred average precision performance; mechanical node equilibrium model; multimodal fusion model; probability maximization; semantic structure; supervised methods; two-stage semantic model; unsupervised concept fusion methods; video concept detection; video concept indexing; Adaptation models; Correlation; Detectors; Histograms; Indexing; Semantics; Vectors; Concept fusion; domain adaption; multimodal fusion; video concept indexing;
  • fLanguage
    English
  • Journal_Title
    Multimedia, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1520-9210
  • Type
    jour
  • DOI
    10.1109/TMM.2015.2398195
  • Filename
    7027217