Title :
Feature Selection Based on Dependency Margin
Author :
Yong Liu ; Feng Tang ; Zhiyong Zeng
Author_Institution :
State Key Laboratory of Industrial Control Technology, Zhejiang University, Hangzhou, China
Abstract :
Feature selection seeks a subset of features from a larger feature pool such that the selected subset provides the same or even better performance than the whole set. It is usually a critical preprocessing step for machine-learning applications such as clustering and classification. In this paper, we focus on feature selection for supervised classification, which aims to find the features that best predict the class labels. Traditional greedy search algorithms add features incrementally based on the relevance of each candidate feature to the class label. However, this may lead to suboptimal results when redundant features interfere with the selection. To solve this problem, we propose a subset selection algorithm that considers the relevance to the label of both the selected and the remaining features. The intuition is that features that have no better alternatives in the feature set should be selected first. We formulate the selection problem as maximizing the dependency margin, which is measured by the difference between the performance of the selected feature set and the performance of the remaining feature set. Extensive experiments on various data sets show the superiority of the proposed approach over traditional algorithms.
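The margin idea in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: it assumes discrete features, uses empirical mutual information as a stand-in for "feature set performance", and scores a candidate by the dependency of the selected set (with the candidate added) minus the dependency of the features left behind. The names `mutual_information` and `select_by_margin` are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(rows, labels, features):
    """Empirical I(X_features; Y) in bits for discrete data.

    Assumption: mutual information stands in for the paper's
    dependency measure, which the abstract does not specify.
    """
    if not features:
        return 0.0
    n = len(rows)
    keys = [tuple(row[f] for f in features) for row in rows]
    joint = Counter(zip(keys, labels))
    px = Counter(keys)
    py = Counter(labels)
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


def select_by_margin(rows, labels, k):
    """Greedy forward search that maximizes a dependency margin.

    Margin of candidate f (an illustrative definition):
    I(selected + f; Y) - I(remaining without f; Y).
    A feature whose information the rest can replace scores low.
    """
    remaining = list(range(len(rows[0])))
    selected = []
    for _ in range(k):
        def margin(f):
            rest = [g for g in remaining if g != f]
            return (mutual_information(rows, labels, selected + [f])
                    - mutual_information(rows, labels, rest))

        best = max(remaining, key=margin)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data: features 0 and 1 are exact duplicates (redundant),
# feature 2 carries information no other feature has.
rows = [(a, a, b) for a in (0, 1) for b in (0, 1)]
labels = [2 * a + b for a in (0, 1) for b in (0, 1)]
print(select_by_margin(rows, labels, 2))
```

On this toy data every single feature has the same relevance to the label (1 bit), so a relevance-only greedy search cannot tell them apart. The margin picks feature 2 first because the remaining features cannot replace it, while either duplicate can be replaced by the other.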
Keywords :
feature selection; greedy algorithms; learning (artificial intelligence); pattern classification; search problems; set theory; dependency margin; feature pool; feature relevances; feature set performance; greedy search algorithms; machine-learning applications; subset selection algorithm; supervised classification; Approximation algorithms; Bayes methods; Markov processes; Prediction algorithms; Redundancy; Silicon; Conditionally independent; forward greedy search; redundant feature
Journal_Title :
Cybernetics, IEEE Transactions on
DOI :
10.1109/TCYB.2014.2347372