DocumentCode :
255187
Title :
A clustering based genetic algorithm for feature selection
Author :
Rostami, M. ; Moradi, P.
Author_Institution :
Dept. of Comput. Eng., Univ. of Kurdistan, Sanandaj, Iran
fYear :
2014
fDate :
27-29 May 2014
Firstpage :
112
Lastpage :
116
Abstract :
Feature selection is a fundamental data preprocessing step in data mining whose goal is to remove irrelevant and/or redundant features from a given dataset. In this paper, we present a clustering-based genetic algorithm for feature selection (CGAFS). The proposed algorithm works in three steps. In the first step, the subset size is determined. In the second step, features are divided into clusters using the k-means clustering algorithm. Finally, in the third step, features are selected using a genetic algorithm with a new clustering-based repair operation. The performance of the proposed method was assessed on five benchmark classification problems and compared with the results obtained from four existing well-known feature selection algorithms. The results show that CGAFS consistently produces better classification accuracies.
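The three steps named in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify how the subset size is determined, how the repair operation works, or which fitness function is used, so everything below is an assumption made for illustration: the subset size is passed in as `size`, the repair step simply forces each chromosome to hold exactly `size` features while preferring features from clusters not yet represented, and `fitness` is a caller-supplied function (the paper evaluates classification accuracy). The names `kmeans_features`, `repair`, and `cgafs_sketch` are hypothetical and do not come from the paper.

```python
import random

def kmeans_features(columns, k, iters=20, rng=random):
    """Cluster features (each a list of values over samples) with plain k-means.
    Returns a list giving the cluster index of every feature."""
    centers = [list(c) for c in rng.sample(columns, k)]
    labels = [0] * len(columns)
    for _ in range(iters):
        # Assign each feature to its nearest center (squared Euclidean distance).
        for i, col in enumerate(columns):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(col, centers[c])),
            )
        # Recompute each center as the mean of its members; empty clusters keep theirs.
        for c in range(k):
            members = [columns[i] for i in range(len(columns)) if labels[i] == c]
            if members:
                centers[c] = [sum(v) / len(members) for v in zip(*members)]
    return labels

def repair(subset, labels, size, rng=random):
    """ASSUMED repair rule (not from the paper): force exactly `size` features,
    dropping from crowded clusters and adding from unrepresented ones."""
    subset = set(subset)
    while len(subset) > size:
        counts = {}
        for f in subset:
            counts[labels[f]] = counts.get(labels[f], 0) + 1
        crowded = max(counts, key=counts.get)
        subset.remove(rng.choice(sorted(f for f in subset if labels[f] == crowded)))
    while len(subset) < size:
        used = {labels[f] for f in subset}
        pool = [f for f in range(len(labels)) if f not in subset]
        fresh = [f for f in pool if labels[f] not in used]
        subset.add(rng.choice(fresh or pool))
    return frozenset(subset)

def cgafs_sketch(columns, fitness, size, k, pop_size=20, generations=30,
                 p_mut=0.1, seed=0):
    """Step 1: `size` is given. Step 2: cluster features. Step 3: GA with repair."""
    rng = random.Random(seed)
    n = len(columns)
    labels = kmeans_features(columns, k, rng=rng)
    pop = [repair(rng.sample(range(n), size), labels, size, rng)
           for _ in range(pop_size)]
    for _ in range(generations):
        # Truncation selection: the fitter half survives and breeds.
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Uniform crossover over the union of both parents' features.
            child = {f for f in a | b if rng.random() < 0.5}
            # Mutation: flip the membership of each feature with probability p_mut.
            child ^= {f for f in range(n) if rng.random() < p_mut}
            children.append(repair(child, labels, size, rng))
        pop = parents + children
    return max(pop, key=fitness), labels

if __name__ == "__main__":
    # Toy data: six features forming three near-duplicate pairs (values over 3 samples).
    cols = [[0.0, 0.1, 0.0], [0.1, 0.0, 0.1], [5.0, 5.1, 5.0],
            [5.1, 5.0, 5.1], [9.0, 9.1, 9.0], [9.1, 9.0, 9.1]]
    # Placeholder fitness; a real run would score a classifier on the subset.
    best, labels = cgafs_sketch(cols, lambda s: len(s & {0, 2, 4}), size=3, k=3)
    print(sorted(best), labels)
```

In this sketch the repair step is what ties the clustering to the search: after crossover and mutation, a chromosome with too many or too few features is adjusted so that the selected features are spread across clusters, which discourages picking several redundant features from the same cluster.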
Keywords :
data mining; feature selection; genetic algorithms; CGAFS; benchmark classification problems; clustering-based genetic algorithm for feature selection; data preprocessing step; irrelevant feature; k-means clustering algorithm; redundant feature; Accuracy; Benchmark testing; Boolean functions; Cancer; Data structures; Diabetes; Sonar; feature clustering; genetic algorithm
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Information and Knowledge Technology (IKT), 2014 6th Conference on
Conference_Location :
Shahrood
Print_ISBN :
978-1-4799-5658-6
Type :
conf
DOI :
10.1109/IKT.2014.7030343
Filename :
7030343