• DocumentCode
    255187
  • Title

    A clustering based genetic algorithm for feature selection

  • Author

    Rostami, M. ; Moradi, P.

  • Author_Institution
    Dept. of Comput. Eng., Univ. of Kurdistan, Sanandaj, Iran
  • fYear
    2014
  • fDate
    27-29 May 2014
  • Firstpage
    112
  • Lastpage
    116
  • Abstract
    Feature selection is a fundamental data preprocessing step in data mining whose goal is to remove irrelevant and/or redundant features from a given dataset. In this paper, we present a clustering-based genetic algorithm for feature selection (CGAFS). The proposed algorithm works in three steps. In the first step, the subset size is determined. In the second step, the features are divided into clusters using the k-means clustering algorithm. Finally, in the third step, features are selected using a genetic algorithm with a new clustering-based repair operation. The performance of the proposed method was assessed on five benchmark classification problems and compared with the results obtained from four well-known existing feature selection algorithms. The results show that CGAFS consistently produces better classification accuracies.
  • Keywords
    data mining; feature selection; genetic algorithms; CGAFS; benchmark classification problems; clustering-based genetic algorithm for feature selection; data preprocessing step; irrelevant feature; k-means clustering algorithm; redundant feature; Accuracy; Benchmark testing; Boolean functions; Cancer; Data structures; Diabetes; Sonar; feature clustering
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information and Knowledge Technology (IKT), 2014 6th Conference on
  • Conference_Location
    Shahrood
  • Print_ISBN
    978-1-4799-5658-6
  • Type

    conf

  • DOI
    10.1109/IKT.2014.7030343
  • Filename
    7030343
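
The abstract names a clustering based repair operation inside the genetic algorithm but does not describe how it works. The sketch below is only one plausible reading, not the authors' method: it assumes a binary chromosome (1 = feature selected), a list of cluster labels such as k-means might produce, and a fixed target subset size. The function name `repair`, its parameters, and the balancing rule (drop from the most crowded cluster, add from the least represented one) are all hypothetical.

```python
import random


def repair(chromosome, clusters, subset_size, rng):
    """Hypothetical repair: force exactly `subset_size` selected features
    while keeping the selection spread across feature clusters."""
    if not 0 <= subset_size <= len(chromosome):
        raise ValueError("subset_size must be between 0 and the number of features")
    repaired = list(chromosome)

    def selected_per_cluster():
        # Count how many selected features each cluster currently holds.
        counts = {}
        for bit, cluster in zip(repaired, clusters):
            if bit:
                counts[cluster] = counts.get(cluster, 0) + 1
        return counts

    # Too many features: drop a random one from the most crowded cluster.
    while sum(repaired) > subset_size:
        counts = selected_per_cluster()
        crowded = max(sorted(counts), key=counts.get)
        candidates = [i for i, bit in enumerate(repaired)
                      if bit and clusters[i] == crowded]
        repaired[rng.choice(candidates)] = 0

    # Too few features: add a random one from the least represented cluster
    # that still has unselected features.
    while sum(repaired) < subset_size:
        counts = selected_per_cluster()
        open_clusters = sorted({c for bit, c in zip(repaired, clusters) if not bit})
        sparse = min(open_clusters, key=lambda c: counts.get(c, 0))
        candidates = [i for i, bit in enumerate(repaired)
                      if not bit and clusters[i] == sparse]
        repaired[rng.choice(candidates)] = 1

    return repaired


# Assumed cluster labels for six features (illustrative, not from the paper).
clusters = [0, 0, 0, 1, 1, 2]
rng = random.Random(0)
too_many = repair([1, 1, 1, 1, 0, 0], clusters, 3, rng)
too_few = repair([1, 0, 0, 0, 0, 0], clusters, 3, rng)
```

In a full pipeline under these assumptions, such a repair step would run on each offspring after crossover and mutation, so that every candidate evaluated by the classifier has the subset size fixed in the first step.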