DocumentCode
2866920
Title
An Improved χ2 (CHI) Statistics Method for Text Feature Selection
Author
Yan, Tang ; Ting, Xiao
Author_Institution
Coll. of Comput. & Inf. Sci., Southwest Univ., Chongqing, China
fYear
2009
fDate
11-13 Dec. 2009
Firstpage
1
Lastpage
4
Abstract
Feature selection is an active topic in the search field, particularly in text categorization. To overcome the shortcomings of the traditional χ2 (CHI) approach, this paper proposes an improved χ2 (CHI) statistics method. The method incorporates criteria from traditional statistical methods, such as Document Frequency and Class Accuracy, into the χ2 (CHI) statistic. Experimental results show that the proposed method is more effective than the traditional χ2 (CHI) method.
Keywords
data mining; statistical analysis; χ2 CHI statistics method; class accuracy criterion; document frequency criterion; feature selection; text categorization; Data mining; Educational institutions; Entropy; Frequency; Information science; Mutual information; Statistical analysis; Statistics; Text categorization; Text mining
fLanguage
English
Publisher
IEEE
Conference_Titel
Computational Intelligence and Software Engineering, 2009. CiSE 2009. International Conference on
Conference_Location
Wuhan
Print_ISBN
978-1-4244-4507-3
Electronic_ISBN
978-1-4244-4507-3
Type
conf
DOI
10.1109/CISE.2009.5366401
Filename
5366401