DocumentCode :
3026759
Title :
A hybrid feature subset selection by combining filters and genetic algorithm
Author :
Singh, Suriender ; Selvakumar, S.
Author_Institution :
Dept. of Comput. Sci. & Eng., Nat. Inst. of Technol., Tiruchirappalli, India
fYear :
2015
fDate :
15-16 May 2015
Firstpage :
283
Lastpage :
289
Abstract :
The presence of a large number of irrelevant features degrades classifier accuracy, hinders understanding of the data, and increases the overall time needed for training and classification. Hence, feature selection is a critical step in the machine learning process. The role of feature selection is to select a subset of size 'd' (d < n) from the given set of 'n' features that leads to the smallest classification error. The feature selection problem can be seen as an optimization problem where the goal is to pick the optimal or near-optimal feature subset with respect to an objective function. Based on the literature, it is intuitively felt that the classifier will give its optimum performance if the high-dimensional data is reduced to include only relevant attributes with low redundancy. Further, it is seen that the filter method is performance centric and that genetic algorithms are insensitive to noisy data. This motivated us to combine the advantages of the filter method with the genetic algorithm in a hybrid system that selects the optimal feature subset from the given original feature set. The contributions of this paper include simultaneous optimization of the feature subset and classifier parameters, and a multi-objective function that reduces the classification error while also reducing the cardinality of the feature subset and its cost. A vital aspect of this model is that the initial population is generated through various filter approaches at the initialization stage. Further, to evaluate the effectiveness of the model, experiments were conducted using KNN and decision trees (such as CART) on various UCI machine learning datasets and generated datasets. The experimental results show that the proposed model effectively reduces the number of features without degrading the classification accuracy.
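The hybrid idea in the abstract (filter scores seed the initial population, then a genetic algorithm minimizes classification error plus a penalty on subset size) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the synthetic data, the class-mean-difference filter (a stand-in for filters such as chi-square or gain ratio), the fixed k = 3 for KNN, the weight `alpha`, and the crossover and mutation settings. Classifier-parameter optimization and feature cost, which the abstract also mentions, are left out.

```python
import random

def make_data(n=60, seed=0):
    """Synthetic data (assumed): features 0 and 1 carry the class signal, 2..5 are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [label * 2.0 + rng.gauss(0, 0.5), label * 2.0 + rng.gauss(0, 0.5)]
        row += [rng.gauss(0, 1) for _ in range(4)]
        X.append(row)
        y.append(label)
    return X, y

def filter_score(X, y, j):
    """Toy filter: absolute difference between the two class means of feature j."""
    a = [r[j] for r, c in zip(X, y) if c == 0]
    b = [r[j] for r, c in zip(X, y) if c == 1]
    return abs(sum(a) / len(a) - sum(b) / len(b))

def knn_error(X, y, mask, k=3):
    """Leave-one-out error of k-NN restricted to features with mask[j] == 1."""
    idx = [j for j, m in enumerate(mask) if m]
    if not idx:
        return 1.0  # an empty subset is treated as the worst case
    wrong = 0
    for i, row in enumerate(X):
        dists = sorted(
            (sum((row[j] - other[j]) ** 2 for j in idx), y[t])
            for t, other in enumerate(X) if t != i
        )
        votes = sum(lbl for _, lbl in dists[:k])
        pred = 1 if votes * 2 > k else 0
        wrong += pred != y[i]
    return wrong / len(X)

def fitness(X, y, mask, alpha=0.1):
    """Lower is better: classification error plus a penalty on subset cardinality."""
    return knn_error(X, y, mask) + alpha * sum(mask) / len(mask)

def seeded_population(X, y, size, rng):
    """Seed with top-1, top-2, ... masks from the filter ranking, then pad randomly."""
    n = len(X[0])
    ranked = sorted(range(n), key=lambda j: -filter_score(X, y, j))
    pop = [[1 if j in ranked[:top] else 0 for j in range(n)] for top in range(1, n + 1)]
    while len(pop) < size:
        pop.append([rng.randint(0, 1) for _ in range(n)])
    return pop[:size]

def run_ga(X, y, pop_size=10, generations=15, seed=1):
    """Elitist GA: keep the better half, refill with one-point crossover and mutation."""
    rng = random.Random(seed)
    pop = seeded_population(X, y, pop_size, rng)
    for _ in range(generations):
        pop.sort(key=lambda m: fitness(X, y, m))
        parents = pop[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(a))
            child = a[:cut] + b[cut:]
            if rng.random() < 0.2:  # occasional single-bit mutation
                child[rng.randrange(len(child))] ^= 1
            children.append(child)
        pop = parents + children
    return min(pop, key=lambda m: fitness(X, y, m))

X, y = make_data()
best_mask = run_ga(X, y)
print("selected features:", [j for j, m in enumerate(best_mask) if m])
```

Because the better half of each generation is carried over unchanged and the seeded population contains the all-features mask, the returned subset never scores worse than using every feature under this fitness function.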
Keywords :
data reduction; decision trees; feature selection; genetic algorithms; information filtering; learning (artificial intelligence); pattern classification; KNN; UCI machine learning process; classification error reduction; combining filter method; data classification; decision tree; genetic algorithm; high dimensional data reduction; hybrid feature subset selection; multiobjective function; noise data; optimization problem; Accuracy; Decision trees; Frequency selective surfaces; Genetic algorithms; Optimization; Sociology; Statistics; Chi-Square; Decision tree; Dimensionality reduction; Genetic Algorithm; Machine learning; Ratio Gain;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computing, Communication & Automation (ICCCA), 2015 International Conference on
Conference_Location :
Noida
Print_ISBN :
978-1-4799-8889-1
Type :
conf
DOI :
10.1109/CCAA.2015.7148389
Filename :
7148389