• DocumentCode
    155669
  • Title
    Sparse topic models by parameter sharing

  • Author
    Soleimani, Hossein ; Miller, David J.

  • Author_Institution
    Dept. of Electr. Eng., Pennsylvania State Univ., University Park, PA, USA
  • fYear
    2014
  • fDate
    21-24 Sept. 2014
  • Firstpage
    1
  • Lastpage
    6
  • Abstract
    We propose a sparse Bayesian topic model, based on parameter sharing, for modeling text corpora. In Latent Dirichlet Allocation (LDA), each topic models all words, even though many words are not topic-specific, i.e., they have similar occurrence frequencies across different topics. We propose a sparser approach by introducing a universal shared model, used by each topic to model the subset of words that are not topic-specific. A Bernoulli random variable is associated with each word under every topic, determining whether that word is modeled topic-specifically, with a free parameter, or by the shared model, with a common parameter. Our experiments show that our model achieves sparser topic presence in documents and higher test likelihood than LDA.
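    The generative idea in the abstract — a per-topic, per-word Bernoulli switch choosing between a free topic-specific parameter and a shared universal parameter — can be sketched as follows. This is a minimal illustrative sketch, not the paper's exact specification: the switch prior `pi`, the gamma-distributed weights, and all variable names are assumptions for illustration.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    V, K = 8, 3          # toy vocabulary size and number of topics
    pi = 0.3             # assumed prior prob. that a word is topic-specific

    # Universal shared word weights, used by every topic for non-specific words
    shared = rng.gamma(1.0, 1.0, size=V)

    # Bernoulli switches: b[k, w] = 1 -> word w gets a free parameter under topic k
    b = rng.binomial(1, pi, size=(K, V))

    # Topic-specific free parameters (only used where b == 1)
    free = rng.gamma(1.0, 1.0, size=(K, V))

    # Each topic's word distribution: free weight if topic-specific, else shared,
    # normalized over the vocabulary
    beta = np.where(b == 1, free, shared)
    beta = beta / beta.sum(axis=1, keepdims=True)

    # Generate one toy document, LDA-style
    theta = rng.dirichlet(np.ones(K))      # per-document topic proportions
    doc = []
    for _ in range(20):
        z = rng.choice(K, p=theta)         # draw a topic assignment
        w = rng.choice(V, p=beta[z])       # draw a word from that topic
        doc.append(w)
    ```

    Because non-specific words share one parameter across all topics, the topics differ only on the words their switches mark as topic-specific, which is the source of the sparsity the abstract reports.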
  • Keywords
    Bayes methods; maximum likelihood estimation; text analysis; Bayesian topic model; Bernoulli random variable; LDA; documents; latent Dirichlet allocation; parameter sharing; sparse topic models; sparser topic presence; test likelihood; text corpora modeling; universal shared model; Abstracts; Lead; Parameter estimation; Resource management; Sparse models; Topic models; Variational inference
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    2014 IEEE International Workshop on Machine Learning for Signal Processing (MLSP)
  • Conference_Location
    Reims
  • Type
    conf

  • DOI
    10.1109/MLSP.2014.6958911
  • Filename
    6958911