DocumentCode :
41785
Title :
Efficient Heuristic Methods for Multimodal Fusion and Concept Fusion in Video Concept Detection
Author :
Jie Geng ; Zhenjiang Miao ; Xiao-Ping Zhang
Author_Institution :
Inst. of Inf. Sci., Beijing Jiaotong Univ., Beijing, China
Volume :
17
Issue :
4
fYear :
2015
fDate :
Apr-15
Firstpage :
498
Lastpage :
511
Abstract :
Semantic models are widely used to bridge the semantic gap between low-level and high-level features in video concept indexing. Multimodal fusion and concept fusion are two commonly used approaches to building semantic models. In previous work, domain adaptation has been neglected in multimodal fusion, and many probability-maximization-based and unsupervised concept fusion methods are counterintuitive because they do not incorporate subjective human intuition. In this paper, we present a new two-stage semantic model that combines multimodal fusion and concept fusion and incorporates human heuristics. In the multimodal fusion model, we employ a new generic unsupervised method, domain adaptive linear combination (DALC), which updates the linear combination (LC) weights by incorporating the differences in element distributions between the training and testing domains. In the concept fusion model, a novel mechanical node equilibrium (NE) model is developed, in which forces model the concept correlations and update the scores of concepts represented by nodes. The NE model is intuitive and can incorporate multiple kinds of correlations simultaneously to construct a more sophisticated semantic structure. Compared with other state-of-the-art supervised and unsupervised methods, the new model can use either unsupervised or supervised factors to significantly improve the mean inferred average precision (MAP) performance on all datasets.
Keywords :
image fusion; indexing; optimisation; probability; unsupervised learning; video signal processing; DALC; LC weights; MAP performance; NE model; concept fusion model; domain adaptive linear combination; heuristic methods; high-level features; linear combination weights; low-level features; mean inferred average precision performance; mechanical node equilibrium model; multimodal fusion model; probability maximization; semantic structure; supervised methods; two-stage semantic model; unsupervised concept fusion methods; video concept detection; video concept indexing; Adaptation models; Correlation; Detectors; Histograms; Indexing; Semantics; Vectors; Concept fusion; domain adaption; multimodal fusion; video concept indexing;
fLanguage :
English
Journal_Title :
Multimedia, IEEE Transactions on
Publisher :
IEEE
ISSN :
1520-9210
Type :
jour
DOI :
10.1109/TMM.2015.2398195
Filename :
7027217