DocumentCode
155669
Title
Sparse topic models by parameter sharing
Author
Soleimani, Hossein ; Miller, David J.
Author_Institution
Dept. of Electr. Eng., Pennsylvania State Univ., University Park, PA, USA
fYear
2014
fDate
21-24 Sept. 2014
Firstpage
1
Lastpage
6
Abstract
We propose a sparse Bayesian topic model, based on parameter sharing, for modeling text corpora. In Latent Dirichlet Allocation (LDA), each topic models all words, even though many words are not topic-specific, i.e., they have similar occurrence frequencies across different topics. We propose a sparser approach by introducing a universal shared model that each topic uses to model the subset of words that are not topic-specific. A Bernoulli random variable is associated with each word under every topic, determining whether that word is modeled topic-specifically, with a free parameter, or by the shared model, with a common parameter. Our experiments show that the model achieves sparser topic presence in documents and higher test likelihood than LDA.
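The parameter-sharing mechanism described in the abstract can be illustrated with a small generative sketch. This is not the authors' model or inference procedure (which is Bayesian, with variational inference); it is only a toy construction, with hypothetical names and sizes, showing how per-topic Bernoulli switches select between a topic-specific free parameter and a single shared parameter for each word:

```python
import numpy as np

rng = np.random.default_rng(0)

V, K = 8, 3  # toy vocabulary size and topic count (hypothetical)

# Universal shared word distribution, used for words that are
# not topic-specific (common parameter across all topics).
shared = rng.dirichlet(np.ones(V))

# Topic-specific free parameters, one distribution per topic;
# only entries where the switch is "on" are actually used.
specific = rng.dirichlet(np.ones(V), size=K)

# Bernoulli switch per (topic, word): 1 -> model the word
# topic-specifically, 0 -> fall back to the shared model.
switches = rng.binomial(1, 0.3, size=(K, V))

# Effective per-topic word distributions: pick the source per word,
# then renormalize so each row is a valid distribution.
topics = switches * specific + (1 - switches) * shared
topics = topics / topics.sum(axis=1, keepdims=True)
```

With a low switch probability, most entries of `topics` coincide (up to normalization) with the shared distribution, which is the source of the sparsity: only the genuinely topic-specific words consume free parameters.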
Keywords
Bayes methods; maximum likelihood estimation; text analysis; Bayesian topic model; Bernoulli random variable; LDA; documents; latent Dirichlet allocation; parameter sharing; sparse topic models; sparser topic presence; test likelihood; text corpora modeling; universal shared model; parameter estimation; resource management; sparse models; topic models; variational inference;
fLanguage
English
Publisher
ieee
Conference_Titel
2014 IEEE International Workshop on Machine Learning for Signal Processing (MLSP)
Conference_Location
Reims
Type
conf
DOI
10.1109/MLSP.2014.6958911
Filename
6958911
Link To Document