Title :
A new feature selection algorithm in text categorization
Author :
Zhao, Wei ; Wang, Yafei ; Li, Dan
Author_Institution :
Coll. of Inf. Technol., Jilin Agric. Univ., Changchun, China
Abstract :
A major problem in text classification is the high dimensionality of the feature space. This paper investigates how a genetic algorithm (GA) and the k-means algorithm can help select relevant features in text classification. The proposed method uses the GA to perform a global search over features, and applies the k-means algorithm in the selection operation to control the scope of the search and to ensure the validity of each gene and the speed of convergence. Our experimental results show that the combination of the GA and the k-means algorithm is useful for reducing the feature dimension, and that it improves the accuracy and efficiency of text classification.
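The abstract does not say how k-means enters the selection operation, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes that each individual is a binary feature mask, that the population's fitness values are clustered with 1-D k-means, and that only the cluster with the highest mean fitness supplies parents. The names kmeans_1d, select_parents, fitness and ga_kmeans_select, the toy fitness function, and all parameter values are hypothetical.

```python
import random


def kmeans_1d(values, k, iters=20):
    """Plain 1-D k-means (Lloyd's algorithm). Returns (labels, centroids)."""
    lo, hi = min(values), max(values)
    # Spread the initial centroids evenly over the observed value range.
    centroids = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assign every value to its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Move each non-empty cluster's centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centroids[c] = sum(members) / len(members)
    return labels, centroids


def select_parents(population, scores, k):
    """Assumed selection step: keep only the cluster with the highest mean fitness."""
    labels, centroids = kmeans_1d(scores, k)
    best_cluster = max(set(labels), key=lambda c: centroids[c])
    return [ind for ind, lab in zip(population, labels) if lab == best_cluster]


def fitness(mask, relevant, penalty=0.1):
    """Toy stand-in for classifier accuracy: reward relevant features, penalise size."""
    hits = sum(1 for i in relevant if mask[i])
    return hits - penalty * sum(mask)


def ga_kmeans_select(n_features, relevant, pop_size=30, generations=40,
                     k=3, p_mut=0.02, seed=0):
    """Evolve binary feature masks; returns the best mask seen during the run."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    best = max(pop, key=lambda m: fitness(m, relevant))
    for _ in range(generations):
        scores = [fitness(m, relevant) for m in pop]
        parents = select_parents(pop, scores, k)
        children = []
        while len(children) < pop_size:
            a, b = rng.choice(parents), rng.choice(parents)
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation with probability p_mut per gene.
            child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]
            children.append(child)
        pop = children
        gen_best = max(pop, key=lambda m: fitness(m, relevant))
        if fitness(gen_best, relevant) > fitness(best, relevant):
            best = gen_best
    return best


if __name__ == "__main__":
    mask = ga_kmeans_select(n_features=20, relevant={1, 4, 7, 12, 18})
    print("selected features:", [i for i, bit in enumerate(mask) if bit])
```

In an actual text classification setting, the toy fitness function would be replaced by a measure such as validation accuracy of a classifier trained on the selected terms. Restricting parents to one fitness cluster is one possible reading of "control the scope of the search"; other readings, such as clustering the masks themselves, are equally plausible.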
Keywords :
genetic algorithms; pattern clustering; text analysis; feature selection algorithm; genetic algorithm; global searching; k-means algorithm; text categorization; text classification problems; Automatic control; Automation; Communication system control; Computer science; Educational institutions; Information technology; Mathematical model; Space technology; feature selection
Conference_Title :
Computer Communication Control and Automation (3CA), 2010 International Symposium on
Conference_Location :
Tainan
Print_ISBN :
978-1-4244-5565-2
DOI :
10.1109/3CA.2010.5533870