Title :
Improved IG Approach Based on Compensation Factor and Penalty Factor for Feature Distribution
Author :
Zhang, Yu ; Zhang, De-Xian
Author_Institution :
Coll. of Inf. Sci. & Technol., Henan Univ. of Technol., Zhengzhou, China
Abstract :
The Information Gain (IG) algorithm for text feature selection often selects features that are low-frequency in the designated category but high-frequency in other categories, which is clearly not the desired result for feature selection. To overcome this shortcoming, this paper proposes an improved IG approach based on a compensation factor and a penalty factor for feature distribution. Experimental results show that the improved method effectively balances the information contributed by the presence and the absence of a feature, and achieves better classification results.
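
For context, the following is a minimal sketch of the standard (unimproved) IG score for a binary term-presence feature. It is not the paper's method: the abstract does not define the compensation and penalty factors, so they are not implemented here. The function name `information_gain` and the toy documents are illustrative assumptions.

```python
import math
from collections import Counter


def information_gain(docs, labels, term):
    """Standard IG of a binary 'term present / term absent' feature.

    docs   : list of sets of tokens (hypothetical toy representation)
    labels : list of category labels, one per document
    Note: this is the textbook baseline, not the improved variant
    described in the abstract.
    """
    n = len(docs)
    class_counts = Counter(labels)
    # Per-category counts of documents that contain the term.
    present = Counter(lab for doc, lab in zip(docs, labels) if term in doc)
    # Per-category counts of documents that do not contain the term.
    absent = {lab: class_counts[lab] - present[lab] for lab in class_counts}
    n_present = sum(present.values())
    n_absent = n - n_present

    def entropy(counts, total):
        # Shannon entropy in bits; an empty partition contributes 0.
        if total == 0:
            return 0.0
        return -sum((c / total) * math.log2(c / total)
                    for c in counts if c > 0)

    h_class = entropy(class_counts.values(), n)
    h_present = entropy(present.values(), n_present)
    h_absent = entropy(absent.values(), n_absent)
    # IG = H(C) - P(t) H(C|t) - P(not t) H(C|not t)
    return h_class - (n_present / n) * h_present - (n_absent / n) * h_absent


# Toy corpus (made up for illustration only).
docs = [{"goal", "match"}, {"goal", "team"}, {"vote", "party"}, {"vote", "law"}]
labels = ["sport", "sport", "politics", "politics"]

for term in ("goal", "vote", "team"):
    print(term, information_gain(docs, labels, term))
```

In this toy corpus, "vote" never occurs in the "sport" category yet receives the same score as "goal", because the standard formula rewards the absence of a term as much as its presence. This is the kind of behaviour the abstract identifies as undesirable when selecting features for a designated category.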
Keywords :
information retrieval; pattern classification; text analysis; compensation factor; feature distribution; information gain algorithm; penalty factor; text feature selection; Classification algorithms; Machine learning; Space vehicles; Support vector machine classification; Text categorization; Training
Conference_Title :
Pattern Recognition (CCPR), 2010 Chinese Conference on
Conference_Location :
Chongqing
Print_ISBN :
978-1-4244-7209-3
Electronic_ISBN :
978-1-4244-7210-9
DOI :
10.1109/CCPR.2010.5659145