DocumentCode
177219
Title
Mixture of topic model for multi-document summarization
Author
Liu Na ; Li Ming-Xia ; Lu Ying ; Tang Xiao-jun ; Wang Hai-Wen ; Xiao Peng
Author_Institution
Sch. of Inf. Sci. & Eng., Dalian Polytech. Univ., Dalian, China
fYear
2014
fDate
May 31 2014-June 2 2014
Firstpage
5168
Lastpage
5172
Abstract
Based on the LDA (Latent Dirichlet Allocation) topic model, a generative model for multi-document summarization, named Titled-LDA, is proposed that simultaneously models the content and the titles of documents. This generative model represents each document as a mixture of topics, and extends this approach to title modeling by allowing the mixture weights for topics to be determined by the titles of the documents. In the mixing stage, the algorithm learns the weights in an adaptive, asymmetric way based on two kinds of information entropy. In this way, the final model incorporates the title information and the content information appropriately, which improves summarization performance. Experiments show that the proposed algorithm achieves better performance than other state-of-the-art algorithms on the DUC2002 corpus.
Keywords
entropy; learning (artificial intelligence); text analysis; adaptive asymmetric learning; content information; document content; document titles; generative model; information entropies; latent Dirichlet allocation topic model; mixing stage; multidocument summarization; summarization performance; title information; titled-LDA; topic mixture weights; topic model mixture; Adaptation models; Computational linguistics; Computational modeling; Data mining; Information entropy; Mathematical model; Resource management; LDA; multi-document summarization; topic model;
fLanguage
English
Publisher
IEEE
Conference_Titel
Control and Decision Conference (2014 CCDC), The 26th Chinese
Conference_Location
Changsha
Print_ISBN
978-1-4799-3707-3
Type
conf
DOI
10.1109/CCDC.2014.6853102
Filename
6853102