Title :
A Cross-Modal Approach for Extracting Semantic Relationships Between Concepts Using Tagged Images
Author :
Katsurai, Makoto ; Ogawa, Tomomi ; Haseyama, Miki
Author_Institution :
Grad. Sch. of Inf. Sci. & Technol., Hokkaido Univ., Sapporo, Japan
Abstract :
This paper presents a cross-modal approach for extracting semantic relationships between concepts using tagged images. In the proposed method, we first project both the text and visual features of the tagged images to a latent space using canonical correlation analysis (CCA). Then, under the probabilistic interpretation of CCA, we calculate a representative distribution of the latent variables for each concept. Based on the representative distributions of the concepts, we derive two types of measures: the semantic relatedness between concepts and the abstraction level of each concept. Because these measures are derived from a cross-modal scheme that enables the collaborative use of both text and visual features, the extracted semantic relationships reflect both semantic and visual contexts. Experiments conducted on tagged images collected from Flickr show that our measures are more consistent with human cognition than conventional measures that use only text features or only visual features, and than WordNet-based measures. In particular, a new measure of semantic relatedness, which satisfies the triangle inequality, obtains the best results among the distance measures tested in our framework. The experiments also show the applicability of our measures to multimedia-related tasks such as concept clustering, image annotation, and tag recommendation.
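The abstract does not spell out how the representative distributions or the relatedness measure are computed, so the following is only a minimal sketch under stated assumptions. It assumes the latent vectors have already been produced by CCA, summarizes each concept with a diagonal Gaussian (hypothetical helper `fit_diag_gaussian`), and substitutes the Hellinger distance as one example of a distance between distributions that satisfies the triangle inequality. It is not the measure proposed in the paper, and the toy data are invented for illustration.

```python
import math

def fit_diag_gaussian(latents):
    """Summarize latent vectors by a per-dimension mean and variance.

    Assumption: a diagonal Gaussian stands in for the paper's
    "representative distribution"; the paper's actual form may differ.
    """
    n = len(latents)
    dim = len(latents[0])
    mean = [sum(v[d] for v in latents) / n for d in range(dim)]
    # A small floor (1e-6, arbitrary) keeps every variance strictly positive.
    var = [sum((v[d] - mean[d]) ** 2 for v in latents) / n + 1e-6
           for d in range(dim)]
    return mean, var

def hellinger(p, q):
    """Hellinger distance between two diagonal Gaussians (a true metric).

    Assumption: used here only as an illustrative distance that obeys
    the triangle inequality, not as the paper's relatedness measure.
    """
    (m1, v1), (m2, v2) = p, q
    bc = 1.0  # Bhattacharyya coefficient, a product over dimensions
    for a, s1, b, s2 in zip(m1, v1, m2, v2):
        bc *= (math.sqrt(2.0 * math.sqrt(s1 * s2) / (s1 + s2))
               * math.exp(-((a - b) ** 2) / (4.0 * (s1 + s2))))
    return math.sqrt(max(0.0, 1.0 - bc))

# Toy 2-D latent vectors for images tagged with each concept (invented data).
dog = fit_diag_gaussian([[1.0, 0.2], [1.2, 0.1], [0.9, 0.3]])
puppy = fit_diag_gaussian([[1.1, 0.4], [1.3, 0.2], [1.0, 0.3]])
car = fit_diag_gaussian([[-1.0, 2.0], [-1.2, 2.2], [-0.8, 1.9]])

d_dog_puppy = hellinger(dog, puppy)
d_dog_car = hellinger(dog, car)
d_puppy_car = hellinger(puppy, car)
```

In this toy setting the distance between "dog" and "puppy" comes out smaller than the distance between "dog" and "car", and the three distances respect the triangle inequality, which is the property the abstract highlights for its own relatedness measure.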
Keywords :
Web sites; correlation methods; database management systems; feature extraction; multimedia communication; natural language processing; statistical analysis; CCA probabilistic interpretation; Flickr; WordNet-based measures; abstraction level; canonical correlation analysis; concept clustering; cross-modal scheme; distance measures; human cognition; image annotation; latent space; multimedia-related tasks; semantic contexts; semantic relatedness; semantic relationship extraction; tag recommendation; tagged images; text features; triangle inequality; visual contexts; visual features; Atmospheric measurements; Biomedical measurement; Particle measurements; Probabilistic logic; Semantics; Visualization; concept relationships
Journal_Title :
Multimedia, IEEE Transactions on
DOI :
10.1109/TMM.2014.2306655