DocumentCode
2425646
Title
Text Categorization Method Based on Improved Mutual Information and Characteristic Weights Evaluation Algorithms
Author
Pei, Zhili ; Shi, Xiaohu ; Marchese, Maurizio ; Liang, Yanchun
Author_Institution
Jilin Univ., Changchun
Volume
4
fYear
2007
fDate
24-27 Aug. 2007
Firstpage
87
Lastpage
91
Abstract
Statistical text categorization can be improved along two main directions: feature selection and the evaluation of characteristic weights. In this paper, we propose an enhanced text categorization method that improves both aspects, based on a modified mutual information algorithm and an evaluation algorithm for characteristic weights. The proposed method is applied to the Reuters-21578 Top10 benchmark test set to examine its effectiveness. Numerical results show that the precision, recall, and F1 value of the proposed method are all superior to those of existing conventional methods.
Keywords
statistical analysis; text analysis; benchmark test set Reuters-21578 Top10; characteristic weights evaluation algorithms; feature selection; mutual information algorithm; statistical methods; text categorization method; Communications technology; Computer science; Educational institutions; Frequency estimation; Frequency shift keying; Mutual information; Performance evaluation; Statistical analysis; Testing; Text categorization
fLanguage
English
Publisher
IEEE
Conference_Titel
Fuzzy Systems and Knowledge Discovery, 2007. FSKD 2007. Fourth International Conference on
Conference_Location
Haikou
Print_ISBN
978-0-7695-2874-8
Type
conf
DOI
10.1109/FSKD.2007.559
Filename
4406359