DocumentCode
179746
Title
Supervised multi-modal topic model for image annotation
Author
Thu Hoai Tran ; Seungjin Choi
Author_Institution
Dept. of Comput. Sci. & Eng., POSTECH, Pohang, South Korea
fYear
2014
fDate
4-9 May 2014
Firstpage
5979
Lastpage
5983
Abstract
Multi-modal topic models are probabilistic generative models in which hidden topics are learned from data of different types. In this paper we present supervised multi-modal latent Dirichlet allocation (smmLDA), which incorporates the class label (global description) into the joint modeling of visual words and caption words (local description) for the image annotation task. We derive a variational inference algorithm to approximately compute the posterior distribution over the latent variables. Experiments on a subset of the LabelMe dataset demonstrate the useful behavior of our model compared to existing topic models.
Keywords
image processing; inference mechanisms; variational techniques; caption words modeling; global description; image annotation; LabelMe dataset subset; latent variables; local description; posterior distribution computation; probabilistic generative models; smmLDA; supervised multimodal latent Dirichlet allocation; supervised multimodal topic model; variational inference algorithm; visual words modeling; Bayes methods; Computational modeling; Computer vision; Data models; Joints; Resource management; Visualization; Image annotation; latent Dirichlet allocation; topic models
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
Conference_Location
Florence, Italy
Type
conf
DOI
10.1109/ICASSP.2014.6854751
Filename
6854751
Link To Document