Title :
A hybrid feature selection method for data sets of thousands of variables
Author :
Liu, Jihong ; Wang, Guoxiong
Author_Institution :
Coll. of Inf. Sci. & Eng., Northeastern Univ., Shenyang, China
Abstract :
Feature selection has become a research focus in applications whose datasets contain thousands of variables. In this study we present a hybrid feature selection (HFS) method that combines the filter and wrapper models of feature subset selection. In the first stage, we use the filter model to rank the features by the mutual information (MI) between each feature and each class, and then choose the k features most relevant to the classes. In the second stage, we apply a wrapper-model-based feature selection algorithm, which uses the Shapley value to evaluate the contribution of each feature to the classification task within a feature subset. Experimental results show that the HFS method obtains better classification performance than feature selection based on the Shapley value alone or on MI alone.
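The two stages described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the classifier, the payoff function, or how Shapley values are computed, so everything below is an assumption made for illustration. Features and labels are taken to be discrete, MI is computed between each feature and the class label variable, the payoff of a feature subset is the training accuracy of a simple lookup-table classifier, and Shapley values are estimated by sampling random feature orderings. The names `filter_top_k`, `subset_accuracy`, `shapley_values`, and `hybrid_select` are hypothetical.

```python
import math
import random
from collections import Counter, defaultdict


def mutual_information(xs, ys):
    """MI (in nats) between two discrete sequences of equal length."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * math.log(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())


def filter_top_k(X, y, k):
    """Stage 1 (filter): indices of the k features with the highest MI."""
    n_features = len(X[0])
    scores = [mutual_information([row[j] for row in X], y)
              for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)[:k]


def subset_accuracy(X, y, subset):
    """Assumed payoff: training accuracy of a lookup-table classifier that
    predicts the majority label for each observed value combination."""
    table = defaultdict(Counter)
    for row, label in zip(X, y):
        table[tuple(row[j] for j in subset)][label] += 1
    return sum(max(counts.values()) for counts in table.values()) / len(y)


def shapley_values(X, y, features, n_permutations=200, seed=0):
    """Stage 2 (wrapper): Monte Carlo Shapley estimate per feature, averaging
    each feature's marginal payoff gain over random feature orderings."""
    rng = random.Random(seed)
    phi = {j: 0.0 for j in features}
    for _ in range(n_permutations):
        order = list(features)
        rng.shuffle(order)
        coalition, previous = [], subset_accuracy(X, y, [])
        for j in order:
            coalition.append(j)
            current = subset_accuracy(X, y, coalition)
            phi[j] += current - previous
            previous = current
    return {j: total / n_permutations for j, total in phi.items()}


def hybrid_select(X, y, k, m, **kwargs):
    """Keep the top-k features by MI, then the top-m of those by Shapley value."""
    candidates = filter_top_k(X, y, k)
    phi = shapley_values(X, y, candidates, **kwargs)
    return sorted(candidates, key=lambda j: phi[j], reverse=True)[:m]
```

Calling `hybrid_select(X, y, k, m)` with `X` as a list of rows of discrete feature values and `y` as the list of class labels returns the indices of the selected features. The filter stage keeps the wrapper stage affordable: each sampled ordering costs one payoff evaluation per candidate feature, which would be impractical over thousands of raw variables.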
Keywords :
classification; game theory; Shapley value; classification performance; classification task; data sets; feature selection algorithm; feature subset selection; filter model; hybrid feature selection method; mutual information; wrapper model; Classification algorithms; Data engineering; Educational institutions; Information filtering; Information filters; Information science; Internet; Mutual information; Space exploration; Text processing; Shapley value; feature selection; mutual information;
Conference_Title :
Advanced Computer Control (ICACC), 2010 2nd International Conference on
Conference_Location :
Shenyang
Print_ISBN :
978-1-4244-5845-5
DOI :
10.1109/ICACC.2010.5486671