DocumentCode :
476213
Title :
The improvement of LDA: Considering avoiding repetition
Author :
Yuan, Bo-qiu ; Zhou, Yiming
Author_Institution :
Sch. of Comput. Sci. & Technol., Beijing Univ. of Aeronaut. & Astronaut., Beijing
Volume :
5
fYear :
2008
fDate :
12-15 July 2008
Firstpage :
2618
Lastpage :
2622
Abstract :
Topic models are increasingly studied in summarization and other applications of discrete data. Although latent Dirichlet allocation (LDA) is one of the widely used topic models for textual and image data, its Dirichlet distribution does not capture correlations between topics very well. To overcome this drawback, directed acyclic graphs (DAGs) and other algebraic distributions, such as the logistic normal distribution, have been used to describe the correlations between topics. These approaches are effective but relatively expensive. Avoiding repetition is a common rule in English documents, yet it was ignored in previous related work. We propose a less expensive amended LDA model that treats the avoidance of repetition as a kind of topic correlation. We first introduce the principle and concept of the amended model, building on related basic work. We then describe the additive functions in detail. We report the results of the adjusted model in an ad hoc IR experiment, which show that the amended model outperforms the basic LDA model. Finally, the influence of some model parameters is analyzed briefly.
Keywords :
computer vision; directed graphs; English document; ad hoc IR experiment; algebra distribution; directed acyclic graph; image data; latent Dirichlet allocation; logistic normal distribution; model parameters; Algebra; Application software; Biological system modeling; Computer science; Cybernetics; Gaussian distribution; Linear discriminant analysis; Logistics; Machine learning; Space technology; Avoid repetition; Correlation; LDA
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Machine Learning and Cybernetics, 2008 International Conference on
Conference_Location :
Kunming
Print_ISBN :
978-1-4244-2095-7
Electronic_ISBN :
978-1-4244-2096-4
Type :
conf
DOI :
10.1109/ICMLC.2008.4620850
Filename :
4620850