• DocumentCode
    2425646
  • Title
    Text Categorization Method Based on Improved Mutual Information and Characteristic Weights Evaluation Algorithms
  • Author
    Pei, Zhili; Shi, Xiaohu; Marchese, Maurizio; Liang, Yanchun
  • Author_Institution
    Jilin Univ., Changchun
  • Volume
    4
  • fYear
    2007
  • fDate
    24-27 Aug. 2007
  • Firstpage
    87
  • Lastpage
    91
  • Abstract
    Statistical text categorization can be improved along two main directions: feature selection and the evaluation of characteristic weights. In this paper, we propose an enhanced text categorization method, based on a modified mutual information algorithm and a characteristic weight evaluation algorithm, that improves both aspects. The proposed method is applied to the Reuters-21578 Top10 benchmark test set to examine its effectiveness. Numerical results show that the precision, recall, and F1 value of the proposed method are all superior to those of existing conventional methods.
  • Keywords
    statistical analysis; text analysis; benchmark test set Reuters-21578 Top10; characteristic weights evaluation algorithms; feature selection; mutual information algorithm; statistical methods; text categorization method; Communications technology; Computer science; Educational institutions; Frequency estimation; Frequency shift keying; Mutual information; Performance evaluation; Statistical analysis; Testing; Text categorization
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Fuzzy Systems and Knowledge Discovery, 2007. FSKD 2007. Fourth International Conference on
  • Conference_Location
    Haikou
  • Print_ISBN
    978-0-7695-2874-8
  • Type
    conf
  • DOI
    10.1109/FSKD.2007.559
  • Filename
    4406359