DocumentCode :
3430753
Title :
The Effectiveness of Redundant Information in Text Classification
Author :
Xu, Yan ; Qiu, Yongqin ; Zhao, Xiaodan
Author_Institution :
Information and Science College, Beijing Language and Culture University, China
fYear :
2012
fDate :
11-13 Aug. 2012
Firstpage :
579
Lastpage :
584
Abstract :
Feature selection plays an important role in text classification. When scoring the attributes of texts, many researchers use the information gain (IG) measure and document frequency (DF). Many experiments have shown that these two methods are among the most effective. However, the core factor behind these algorithms has not been established theoretically. This paper presents a proof, based on the theory of Exact Cover, that the relations among attributes play an important role in feature selection. During selection, we use the Dancing Links (DLX) algorithm and the theory of Association Cluster to reduce redundant information. Our experimental results show that reducing the associated attributes can greatly improve the recall rate but has a negative effect on precision.
Keywords :
Organizations; Dance Link; association; exact cover; feature selection;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Granular Computing (GrC), 2012 IEEE International Conference on
Conference_Location :
Hangzhou, China
Print_ISBN :
978-1-4673-2310-9
Type :
conf
DOI :
10.1109/GrC.2012.6468589
Filename :
6468589