DocumentCode
2084539
Title
An adaptive Markov model for text categorization
Author
Li, Jin ; Yue, Kun ; Liu, WeiYi
Author_Institution
Sch. of Software, Yunnan Univ., China
Volume
1
fYear
2008
fDate
17-19 Nov. 2008
Firstpage
802
Lastpage
807
Abstract
Existing methods for text categorization assume that a document is a bag of words. While computationally efficient, such a representation is unable to capture sequential information. In this paper, a document is treated as a sequence of characters or words, so preprocessing for text categorization, such as word segmentation and feature selection, is not required. Statistical dependencies among neighboring terms of a sequence are captured by Markov models of different orders. We propose a sequence classification method based on an adaptive Markov model. Our method automatically and effectively blends Markov models of different orders for text categorization. We present an extensive experimental evaluation of our method on an English collection and one Chinese corpus. The results show that our method achieves high recall and precision.
Keywords
Markov processes; pattern classification; text analysis; adaptive Markov model; sequence classification methods; text categorization; Classification tree analysis; Context modeling; Frequency; Information science; Intelligent systems; Knowledge engineering; Probability; Search engines; Text categorization; Training data
fLanguage
English
Publisher
IEEE
Conference_Titel
Intelligent System and Knowledge Engineering, 2008. ISKE 2008. 3rd International Conference on
Conference_Location
Xiamen
Print_ISBN
978-1-4244-2196-1
Electronic_ISBN
978-1-4244-2197-8
Type
conf
DOI
10.1109/ISKE.2008.4731039
Filename
4731039