Abstract:
Multimedia content of different types, such as images and texts with similar semantic meanings, is often used together. To exploit the information shared by multi-modal objects, cross-media retrieval, in which the query and the results are of different media types, is becoming increasingly important. Existing methods either neglect the correlations between entities of different media types, or suffer from low performance when they adopt correlation analysis and face queries outside the dataset. In this paper, we present cluster-based correlation analysis (CBCA) to exploit the correlations between different types of multimedia objects and to measure heterogeneous semantic similarities. Given a collection of multimedia documents (MMDs), CBCA first performs clustering in each uni-media feature space to produce several semantic clusters per modality. It then uses the co-occurrence information of semantic clusters across modalities to construct a cross-modal cluster graph (CMCG) that represents the similarities between clusters. Through clustering, CBCA exploits semantic meanings at a finer granularity and mines semantic correlations between clusters rather than between individual multimedia objects. Experiments on a Sina Weibo dataset demonstrate the effectiveness of CBCA compared with state-of-the-art methods.
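The two steps outlined in the abstract (per-modality clustering, followed by co-occurrence-based construction of the cross-modal cluster graph) can be illustrated with a minimal sketch. The abstract does not specify the clustering algorithm or the edge-weighting scheme, so this sketch assumes k-means clustering and raw MMD co-occurrence counts; the function name build_cmcg and the cluster counts k_img and k_txt are illustrative, not the authors' implementation.

```python
# Minimal sketch of the CBCA pipeline under the assumptions stated above:
# k-means per modality, co-occurrence counts as CMCG edge weights.
import numpy as np
from sklearn.cluster import KMeans

def build_cmcg(image_feats, text_feats, k_img=20, k_txt=20):
    """Cluster each modality, then link clusters whose members co-occur
    in the same multimedia document (MMD). Row i of image_feats and
    text_feats is assumed to come from the same MMD."""
    img_labels = KMeans(n_clusters=k_img, n_init=10).fit_predict(image_feats)
    txt_labels = KMeans(n_clusters=k_txt, n_init=10).fit_predict(text_feats)

    # Cross-modal cluster graph as a weight matrix: entry (i, j) counts
    # the MMDs whose image falls in image cluster i and whose text falls
    # in text cluster j.
    cmcg = np.zeros((k_img, k_txt))
    for ci, cj in zip(img_labels, txt_labels):
        cmcg[ci, cj] += 1

    # Row-normalize so weights can be read as cluster-level similarities.
    cmcg = cmcg / np.maximum(cmcg.sum(axis=1, keepdims=True), 1)
    return img_labels, txt_labels, cmcg
```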
Keywords:
data mining; document handling; graph theory; multimedia systems; pattern clustering; query processing; CBCA; CMCG; MMD; Sina Weibo dataset; cluster-based correlation analysis; cross-media retrieval; cross-modal cluster graph; heterogeneous semantic similarity measurement; multimedia content; multimedia documents; multimedia objects; multimodal objects; query; semantic cluster co-occurrence information; semantic correlation mining; uni-media feature spaces; cluster analysis; correlation analysis; multimedia