DocumentCode :
1656714
Title :
Research on Text Feature Selection Algorithm Based on Information Gain and Feature Relation Tree
Author :
Hong Zhang ; Yong-gong Ren ; Xue Yang
Author_Institution :
Sch. of Comput. & Inf. Technol., Liaoning Normal Univ., Dalian, China
fYear :
2013
Firstpage :
446
Lastpage :
449
Abstract :
The classification performance of the conventional information gain (IG) algorithm can decline markedly when classes and features are unevenly distributed. To address this, an improved text feature selection method, UDsIG, is proposed. First, features are selected class by class to reduce the impact of uneven class distribution on feature selection. Next, the equilibrium of each feature's distribution is used to reduce the interference caused by unevenly distributed features. The class features are then processed with a feature relation tree model so that strongly correlated features are retained. Finally, an improved information gain formula based on weighed dispersion is used to obtain the optimal feature subset. Experimental results show that the proposed method achieves better classification performance.
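For orientation, the baseline that the abstract sets out to improve can be sketched as follows. This is a minimal illustration of the classic information gain score for a binary "term present" feature, not the UDsIG method itself: the per-class selection, distribution equilibrium, feature relation tree, and weighed-dispersion steps are not specified in this record and are not reproduced here. The function names and the input format (one token set per document) are assumptions made for the example.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy in bits of a distribution given as raw counts."""
    counts = list(counts)
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def information_gain(docs, labels, term):
    """Classic IG of a binary 'term present' feature with respect to the class.

    docs   -- list of token sets, one per document (assumed input format)
    labels -- list of class labels, aligned with docs
    """
    n = len(docs)
    if n == 0:
        return 0.0
    all_counts = Counter(labels)
    # Class counts among documents that contain the term.
    present = Counter(lab for doc, lab in zip(docs, labels) if term in doc)
    # Class counts among documents that do not contain the term.
    absent = all_counts - present
    n_present = sum(present.values())
    n_absent = n - n_present
    # Expected class entropy after observing whether the term occurs.
    conditional = (
        (n_present / n) * entropy(present.values())
        + (n_absent / n) * entropy(absent.values())
    )
    return entropy(all_counts.values()) - conditional


def top_k_terms(docs, labels, k):
    """Rank the vocabulary by IG (ties broken alphabetically) and keep k terms."""
    vocabulary = set().union(*docs)
    scored = [(information_gain(docs, labels, t), t) for t in vocabulary]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [t for _, t in scored[:k]]
```

Because this score is computed over the whole collection, a term that is informative only for a rare class tends to rank low, which is the kind of sensitivity to uneven class distribution that the abstract describes as motivation.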
Keywords :
feature selection; pattern classification; text analysis; IG algorithm; UDsIG; class maldistribution; classification performance; feature equilibrium; feature maldistribution; feature relation tree model; information gain; text feature selection algorithm; weighed dispersion; Classification algorithms; Computers; Correlation; Dispersion; Educational institutions; Feature extraction; Text categorization; feature relation tree
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Web Information System and Application Conference (WISA), 2013 10th
Conference_Location :
Yangzhou
Print_ISBN :
978-1-4799-3218-4
Type :
conf
DOI :
10.1109/WISA.2013.90
Filename :
6778681