Title :
A clustering based genetic algorithm for feature selection
Author :
Rostami, M. ; Moradi, P.
Author_Institution :
Dept. of Comput. Eng., Univ. of Kurdistan, Sanandaj, Iran
Abstract :
Feature selection is a fundamental data preprocessing step in data mining whose goal is to remove irrelevant and/or redundant features from a given dataset. In this paper, we present a clustering-based genetic algorithm for feature selection (CGAFS). The proposed algorithm works in three steps. In the first step, the subset size is determined. In the second step, the features are divided into clusters using the k-means clustering algorithm. Finally, in the third step, features are selected using a genetic algorithm with a new clustering-based repair operation. The performance of the proposed method was assessed on five benchmark classification problems and compared with the results obtained from four well-known existing feature selection algorithms. The results show that CGAFS consistently produces better classification accuracies.
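The abstract names a clustering-based repair operation but does not describe how it works. The following is only a minimal sketch of one plausible form of such an operator, not the authors' method. It assumes that each feature already carries a cluster label (for example from k-means over the feature columns), that a chromosome is a 0/1 mask over features, and that repair means forcing the mask to the predetermined subset size while removing features from the most crowded cluster and adding features to the least represented one. The names `repair`, `mask`, `cluster_of`, and `subset_size` are hypothetical.

```python
import random
from collections import Counter


def repair(mask, cluster_of, subset_size, rng=None):
    """Return a copy of `mask` with exactly `subset_size` selected features.

    Assumed behaviour (not taken from the paper): surplus features are removed
    from the cluster with the most selected features, and missing features are
    added to the cluster with the fewest selected features.
    """
    rng = rng or random.Random(0)
    mask = list(mask)
    if not 0 <= subset_size <= len(mask):
        raise ValueError("subset_size must lie between 0 and the feature count")
    clusters = sorted(set(cluster_of))

    def selected_per_cluster():
        # Count how many selected features each cluster currently holds.
        counts = Counter({c: 0 for c in clusters})
        for bit, c in zip(mask, cluster_of):
            counts[c] += bit
        return counts

    # Too many features: drop one at a time from the most crowded cluster.
    while sum(mask) > subset_size:
        counts = selected_per_cluster()
        crowded = max(clusters, key=lambda c: counts[c])
        candidates = [i for i, bit in enumerate(mask)
                      if bit and cluster_of[i] == crowded]
        mask[rng.choice(candidates)] = 0

    # Too few features: add one at a time to the least represented cluster
    # that still has an unselected feature.
    while sum(mask) < subset_size:
        counts = selected_per_cluster()
        open_clusters = [c for c in clusters
                         if any(not mask[i] and cluster_of[i] == c
                                for i in range(len(mask)))]
        sparse = min(open_clusters, key=lambda c: counts[c])
        candidates = [i for i, bit in enumerate(mask)
                      if not bit and cluster_of[i] == sparse]
        mask[rng.choice(candidates)] = 1

    return mask


# Hypothetical example: six features grouped into three clusters.
cluster_of = [0, 0, 0, 1, 1, 2]
repaired = repair([1, 1, 1, 1, 0, 0], cluster_of, subset_size=3)
print(repaired, sum(repaired))
```

In a genetic algorithm, an operator of this kind would typically be applied to each offspring after crossover and mutation, so that every individual evaluated by the fitness function has the same number of selected features.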
Keywords :
data mining; feature selection; genetic algorithms; CGAFS; benchmark classification problems; clustering-based genetic algorithm for feature selection; data preprocessing step; irrelevant feature; k-means clustering algorithm; redundant feature; Accuracy; Benchmark testing; Boolean functions; Cancer; Data structures; Diabetes; Sonar; feature clustering; genetic algorithm
Conference_Title :
Information and Knowledge Technology (IKT), 2014 6th Conference on
Conference_Location :
Shahrood
Print_ISBN :
978-1-4799-5658-6
DOI :
10.1109/IKT.2014.7030343