• DocumentCode
    2084539
  • Title
    An adaptive Markov model for text categorization
  • Author
    Li, Jin ; Yue, Kun ; Liu, WeiYi
  • Author_Institution
    Sch. of Software, Yunnan Univ., China
  • Volume
    1
  • fYear
    2008
  • fDate
    17-19 Nov. 2008
  • Firstpage
    802
  • Lastpage
    807
  • Abstract
    Existing methods for text categorization assume that a document is a bag of words. While computationally efficient, such a representation is unable to capture sequential information. In this paper, a document is treated as a sequence of characters or words, so preprocessing for text categorization, such as word segmentation and feature selection, is not required. Statistical dependencies among the neighboring terms of a sequence are captured by Markov models of different orders. We propose a sequence classification method based on an adaptive Markov model. Our method automatically blends Markov models of different orders for text categorization. We present an extensive experimental evaluation of our method on an English collection and one Chinese corpus. The results show the high recall and precision of our method.
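    The idea in the abstract, scoring a document as a character sequence under per-class Markov models of several orders and blending them, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the class name BlendedMarkovClassifier, the max_order parameter, add-one smoothing, the uniform class prior, and above all the fixed equal weights used to blend the orders. The abstract does not describe how the paper's adaptive blending chooses its weights, so this sketch does not reproduce the authors' method.

```python
import math
from collections import defaultdict


class BlendedMarkovClassifier:
    """Character-level Markov models of orders 0..max_order, one set per class.

    Assumption: orders are blended with fixed equal weights. The adaptive
    weighting of the paper is not described in the abstract and is not
    reproduced here.
    """

    def __init__(self, max_order=2):
        self.max_order = max_order
        # counts[label][k][context][next_char] -> frequency
        self.counts = {}
        self.alphabet = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            per_order = self.counts.setdefault(
                label,
                [defaultdict(lambda: defaultdict(int))
                 for _ in range(self.max_order + 1)],
            )
            self.alphabet.update(text)
            for i, ch in enumerate(text):
                # Record ch after each available context of length 0..max_order.
                for k in range(min(i, self.max_order) + 1):
                    per_order[k][text[i - k:i]][ch] += 1
        return self

    def _char_prob(self, label, context, ch):
        vocab = len(self.alphabet) + 1  # one extra slot for unseen characters
        orders = range(min(len(context), self.max_order) + 1)
        total = 0.0
        for k in orders:
            ctx = context[len(context) - k:]
            seen = self.counts[label][k].get(ctx, {})
            # Add-one smoothing keeps every probability strictly positive.
            total += (seen.get(ch, 0) + 1) / (sum(seen.values()) + vocab)
        # Equal-weight average over the orders that have enough context.
        return total / len(orders)

    def log_likelihood(self, label, text):
        return sum(
            math.log(
                self._char_prob(label, text[max(0, i - self.max_order):i], ch)
            )
            for i, ch in enumerate(text)
        )

    def predict(self, text):
        # Uniform class prior assumed: pick the class with the highest likelihood.
        return max(self.counts, key=lambda lab: self.log_likelihood(lab, text))
```

    Because the models operate on raw characters, no word segmentation or feature selection step is needed before calling fit, which matches the motivation stated in the abstract. Replacing the equal-weight average in _char_prob with learned or context-dependent weights would be the natural place to experiment with an adaptive scheme.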
  • Keywords
    Markov processes; pattern classification; text analysis; adaptive Markov model; sequence classification methods; text categorization; Classification tree analysis; Context modeling; Frequency; Information science; Intelligent systems; Knowledge engineering; Probability; Search engines; Text categorization; Training data;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Intelligent System and Knowledge Engineering, 2008. ISKE 2008. 3rd International Conference on
  • Conference_Location
    Xiamen
  • Print_ISBN
    978-1-4244-2196-1
  • Electronic_ISBN
    978-1-4244-2197-8
  • Type
    conf
  • DOI
    10.1109/ISKE.2008.4731039
  • Filename
    4731039