Title :
Pseudo-Supervised Latent Dirichlet Allocation for Image Annotation
Author :
Huong Thi Pham;Seungjin Choi
Author_Institution :
Xeron Healthcare, South Korea
Abstract :
Latent Dirichlet allocation (LDA) is a generative probabilistic model of discrete data, in which each observed item is represented as a finite mixture over latent topics. Several multi-modal extensions of LDA that model annotated data are available for image annotation. Most existing methods model the joint distribution of image features and caption texts in order to capture statistical correlations between the two modalities, introducing an association module to correlate the two sets of hidden topics. In this paper we present an alternative probabilistic model for image annotation, referred to as pseudo-supervised LDA (psLDA), in which we directly use the caption topics to train the image model. Our model consists of two LDAs, a caption model and an image model, which are trained individually. However, the empirical frequencies of the topics in the caption model serve as pseudo-labels for the image model, so that the image and caption models are correlated via these pseudo-labels rather than via latent variables, as in most existing methods. Numerical experiments on the 2688-image LabelMe dataset demonstrate the strong performance of psLDA relative to existing methods such as correspondence LDA (cLDA) and topic-regression multi-modal LDA (trmmLDA), as measured by caption perplexity.
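The two quantities named in the abstract, the empirical topic frequencies used as pseudo-labels and the caption perplexity used for evaluation, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names and input layouts are hypothetical, the topic assignments are assumed to come from an already trained caption LDA, and the perplexity follows the usual definition (the exponential of the negative average per-word log-likelihood), which may differ in detail from the one used in the paper.

```python
import math
from collections import Counter


def empirical_topic_frequencies(assignments, num_topics):
    """Fraction of one caption's words assigned to each topic.

    `assignments` is a list of topic indices, one per caption word
    (assumed to be produced by a trained caption LDA).
    """
    counts = Counter(assignments)
    total = len(assignments)
    return [counts[k] / total for k in range(num_topics)]


def caption_perplexity(word_log_probs_per_caption):
    """Standard perplexity over a set of held-out captions.

    `word_log_probs_per_caption` holds, for each caption, the natural-log
    probability the model gives to each of its words.
    """
    total_log_prob = sum(sum(lps) for lps in word_log_probs_per_caption)
    total_words = sum(len(lps) for lps in word_log_probs_per_caption)
    return math.exp(-total_log_prob / total_words)


# A caption with four words assigned to topics 0, 0, 2, 1 (three topics).
pseudo_label = empirical_topic_frequencies([0, 0, 2, 1], num_topics=3)

# Two captions whose words each have predictive probability 0.5.
perplexity = caption_perplexity([[math.log(0.5)] * 2, [math.log(0.5)]])
```

Under these assumptions, a vector such as `pseudo_label` would play the role of the per-image pseudo-label, and a lower `perplexity` indicates better caption prediction.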
Keywords :
"Numerical models","Probabilistic logic","Computational modeling","Yttrium","Resource management","Data models","Mathematical model"
Conference_Title :
Systems, Man, and Cybernetics (SMC), 2015 IEEE International Conference on
DOI :
10.1109/SMC.2015.336