DocumentCode :
2267847
Title :
Classifying Documents with Maximum Likelihood Approximation of the Dirichlet Multinomial Gibbs Model
Author :
Zhou, Shibin ; Cao, Zhao ; Liu, Yushu
Author_Institution :
Sch. of Comput. Sci. & Technol., Beijing Inst. of Technol., Beijing
Volume :
3
fYear :
2008
fDate :
20-22 Dec. 2008
Firstpage :
71
Lastpage :
75
Abstract :
In text analysis, the Dirichlet compound multinomial (DCM) distribution has recently been shown to be a good model for documents because, unlike the standard multinomial distribution, it captures the phenomenon of word burstiness. In this paper, to improve document modeling performance, we propose the Dirichlet multinomial Gibbs (DMG) model, a variant that combines the DCM and Gibbs distributions by introducing Gibbs parameters into the DCM distribution. We present the maximum likelihood procedure for the DMG model with these Gibbs parameters. In our experiments, the DMG approach inherits the merits of both Gibbs distribution approximation and DCM estimation. More specifically, our experimental results on various real-world text datasets show that maximum likelihood approximation of the DMG model is preferable to some current state-of-the-art methods.
Keywords :
classification; maximum likelihood estimation; statistical distributions; text analysis; DMG model; Dirichlet compound multinomial distribution; Dirichlet multinomial Gibbs model; document classification; maximum likelihood approximation; text analysis; word burstiness phenomenon; Application software; Approximation methods; Computer science; Entropy; Frequency; Information technology; Maximum likelihood estimation; Testing; Text analysis; Text categorization; Document classification; Maximum Likelihood;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Intelligent Information Technology Application, 2008. IITA '08. Second International Symposium on
Conference_Location :
Shanghai
Print_ISBN :
978-0-7695-3497-8
Type :
conf
DOI :
10.1109/IITA.2008.307
Filename :
4739961