DocumentCode
179746
Title
Supervised multi-modal topic model for image annotation
Author
Thu Hoai Tran ; Seungjin Choi
Author_Institution
Dept. of Comput. Sci. & Eng., POSTECH, Pohang, South Korea
fYear
2014
fDate
4-9 May 2014
Firstpage
5979
Lastpage
5983
Abstract
Multi-modal topic models are probabilistic generative models in which hidden topics are learned from data of different types. In this paper we present supervised multi-modal latent Dirichlet allocation (smmLDA), which incorporates the class label (global description) into the joint modeling of visual words and caption words (local description) for the image annotation task. We derive a variational inference algorithm to approximately compute the posterior distribution over the latent variables. Experiments on a subset of the LabelMe dataset demonstrate the useful behavior of our model compared to existing topic models.
Keywords
image processing; inference mechanisms; variational techniques; caption words modeling; global description; image annotation; LabelMe dataset subset; latent variables; local description; posterior distribution computation; probabilistic generative models; smmLDA; supervised multimodal latent Dirichlet allocation; supervised multimodal topic model; variational inference algorithm; visual words modeling; Bayes methods; Computational modeling; Computer vision; Data models; Joints; Resource management; Visualization; Image annotation; latent Dirichlet allocation; topic models
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
Conference_Location
Florence, Italy
Type
conf
DOI
10.1109/ICASSP.2014.6854751
Filename
6854751
Link To Document