An efficient text classification rule extraction method based on chi value and rough set

Author

Ye Wang ; Ming-Chun Wang

Author_Institution

Manage. Sch., Tianjin Univ.

fYear

2006

fDate

13-16 Aug. 2006

Firstpage

1552

Lastpage

1557

Abstract

In this paper we propose a text classification rule extraction method that is more efficient and more practical than existing similar methods. A proximate rule is first defined based on the characteristics of text classification rule extraction. The features of the text set are then selected according to their χ² values, which also provide feature significance information for further feature selection. Rough set theory is then used to further reduce the features on the discrete decision table. Finally, precise rules or proximate rules are extracted using rough set theory. The method fully combines an improved χ² value feature selection with rough set theory, so as to avoid both feature reduction on a large-scale decision table and discretization of the decision table. The method greatly improves the efficiency and the practicability of the extracted text rules. Experimental results demonstrate the effectiveness of the method.
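
The χ² feature selection step mentioned in the abstract can be illustrated with a minimal sketch. It uses the standard χ² statistic for a term/category 2x2 contingency table, not the improved variant proposed in the paper, whose details are not given in this record. The function names `chi_square` and `select_features`, and the representation of documents as sets of terms, are assumptions made for illustration only.

```python
def chi_square(a, b, c, d):
    """Standard chi-square statistic for one term and one category.

    a: documents in the category that contain the term
    b: documents outside the category that contain the term
    c: documents in the category that lack the term
    d: documents outside the category that lack the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        # A zero marginal means the term or category carries no information.
        return 0.0
    return n * (a * d - c * b) ** 2 / denom


def select_features(docs, labels, category, k):
    """Rank terms by chi-square against one category and keep the top k.

    docs is a list of sets of terms (hypothetical representation);
    labels is the parallel list of category names.
    Returns (top-k terms, dict of all scores).
    """
    vocab = sorted({term for doc in docs for term in doc})
    scores = {}
    for term in vocab:
        a = b = c = d = 0
        for doc, label in zip(docs, labels):
            has_term = term in doc
            in_cat = label == category
            if has_term and in_cat:
                a += 1
            elif has_term:
                b += 1
            elif in_cat:
                c += 1
            else:
                d += 1
        scores[term] = chi_square(a, b, c, d)
    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(vocab, key=lambda t: (-scores[t], t))
    return ranked[:k], scores


if __name__ == "__main__":
    docs = [{"ball", "goal"}, {"ball", "team"}, {"vote", "law"}, {"vote", "team"}]
    labels = ["sport", "sport", "politics", "politics"]
    selected, scores = select_features(docs, labels, "sport", 2)
    print(selected)
```

Because each selected term is recorded only as present or absent, the resulting decision table is already discrete, which is consistent with the abstract's point that a separate discretization step is avoided. The scores could also serve as the feature significance information passed on to the rough set reduction, though how the paper uses them is not specified here.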

Keywords

classification; decision tables; feature extraction; knowledge acquisition; rough set theory; text analysis; chi value; decision table; feature selection; text classification rule extraction method; Text categorization

fLanguage

English

Publisher

ieee

Conference_Titel

Machine Learning and Cybernetics, 2006 International Conference on

Conference_Location

Dalian, China

Print_ISBN

1-4244-0061-9

Type

conf

DOI

10.1109/ICMLC.2006.258827

Filename

4028311