Title :
An efficient text classification rule extraction method based on chi2 value and rough set
Author :
Ye Wang ; Ming-Chun Wang
Author_Institution :
Manage. Sch., Tianjin Univ.
Abstract :
In this paper we propose a text classification rule extraction method that is more efficient and more practical than existing similar methods. A proximate rule is first defined, based on the characteristics of text classification rule extraction. The features of the text set are then selected according to their chi2 values, which also provide feature significance information for further feature selection. Rough set theory is then used to further reduce the features on the discrete decision table. Finally, precise rules or proximate rules are extracted using rough set theory. The method fully combines an improved chi2 value feature selection with rough set theory, which avoids both feature reduction on a large-scale decision table and discretization of the decision table. The method greatly improves the efficiency and the practicability of the extracted text rules. Experimental results demonstrate the effectiveness of the method.
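The chi-square scoring step mentioned in the abstract can be illustrated with a minimal sketch. It uses the standard term/category chi-square statistic from the text categorization literature, not the improved variant the authors propose, whose details are not given here. The function names `chi2_term_category` and `rank_terms`, and the representation of documents as sets of tokens, are assumptions made for illustration only.

```python
def chi2_term_category(a, b, c, d):
    """Standard chi-square statistic for a 2x2 term/category table.

    a: documents in the category that contain the term
    b: documents outside the category that contain the term
    c: documents in the category that lack the term
    d: documents outside the category that lack the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        # A degenerate table (an empty row or column) carries no evidence.
        return 0.0
    return n * (a * d - c * b) ** 2 / denom


def rank_terms(docs, labels, category):
    """Rank all terms by chi-square score for one category, highest first.

    docs is a list of token sets; labels is the parallel list of categories.
    Ties are broken alphabetically so the ordering is deterministic.
    """
    vocab = set().union(*docs)
    scores = {}
    for term in vocab:
        a = b = c = d = 0
        for doc, label in zip(docs, labels):
            has_term = term in doc
            in_cat = label == category
            if has_term and in_cat:
                a += 1
            elif has_term:
                b += 1
            elif in_cat:
                c += 1
            else:
                d += 1
        scores[term] = chi2_term_category(a, b, c, d)
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical toy corpus, for illustration only.
docs = [
    {"goal", "match"},
    {"goal", "team"},
    {"stock", "market"},
    {"stock", "team"},
]
labels = ["sport", "sport", "finance", "finance"]
ranked = rank_terms(docs, labels, "sport")
```

In this sketch a term that appears equally often inside and outside the category, such as "team" in the toy corpus, scores zero, while a term that is strongly associated with the category in either direction scores high. The scores could then serve as the feature significance ordering that the abstract says is passed on to the rough set reduction step.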
Keywords :
classification; decision tables; feature extraction; knowledge acquisition; rough set theory; text analysis; chi value; decision table; feature selection; text classification rule extraction method; Text categorization
Conference_Title :
Machine Learning and Cybernetics, 2006 International Conference on
Conference_Location :
Dalian, China
Print_ISBN :
1-4244-0061-9
DOI :
10.1109/ICMLC.2006.258827