Title :
Confident wrapper-type semi-supervised feature selection using an ensemble classifier
Author :
Han, Yongkoo ; Park, Kisung ; Lee, Young-Koo
Author_Institution :
Dept. of Comput. Eng., Kyung Hee Univ., Suwon, South Korea
Abstract :
Feature selection is an important data preprocessing step in pattern recognition. Recently, a wrapper-type semi-supervised feature selection method, known as FW-SemiFS, was proposed to overcome the small labeled sample problem of supervised feature selection. FW-SemiFS does not consider the confidence of predicted unlabeled data, but rather evaluates the relevance of features according to their frequency. Such frequencies are obtained via iterative supervised sequential forward feature selection (SFFS). However, the large amount of computational time associated with iterative SFFS is detrimental to FW-SemiFS. Furthermore, this relevance evaluation method eliminates the primary advantage of wrapper-type feature selection: the ability to evaluate the discriminative power of a combination of features. In this paper, we propose a new wrapper-type semi-supervised feature selection framework that can select a more relevant feature subset using confident unlabeled data. The proposed framework, called ensemble-based semi-supervised feature selection (EN-SemiFS), employs an ensemble classifier that supports the estimation of the confidence of unlabeled data. We analyzed the relationship between wrapper-type feature selection and the confidence of unlabeled data and explored how this relationship can make the semi-supervised feature selection framework faster and more accurate. The experimental results revealed that the proposed method selects a more relevant feature subset than existing methods.
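The general idea described in the abstract (estimate the confidence of unlabeled data with an ensemble, keep only the confident predictions, then run a wrapper-type search on the enlarged training set) can be illustrated with a minimal Python sketch. This is not the authors' EN-SemiFS implementation; the abstract does not specify the base learner, the confidence measure, or the search procedure. Everything below is an assumption made for illustration: a nearest-centroid base classifier, a stratified bootstrap ensemble, confidence defined as the fraction of agreeing votes, a threshold of 0.8, leave-one-out accuracy as the wrapper criterion, and all function names.

```python
import random
from collections import Counter

def centroid_fit(X, y, feats):
    """Per-class mean vectors over the feature indices in feats."""
    sums, counts = {}, Counter(y)
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(feats))
        for k, f in enumerate(feats):
            acc[k] += row[f]
    return {c: [v / counts[c] for v in s] for c, s in sums.items()}

def centroid_predict(model, row, feats):
    """Class of the closest centroid (squared Euclidean distance)."""
    def dist(c):
        return sum((row[f] - m) ** 2 for f, m in zip(feats, model[c]))
    return min(sorted(model), key=dist)

def ensemble_confidence(X_lab, y_lab, X_unl, feats, n_members=15, seed=0):
    """Return (pseudo_label, vote_fraction) for each unlabeled row."""
    rng = random.Random(seed)
    by_class = {}
    for i, label in enumerate(y_lab):
        by_class.setdefault(label, []).append(i)
    votes = [Counter() for _ in X_unl]
    for _ in range(n_members):
        # Stratified bootstrap (assumption): resample within each class
        # so that every ensemble member sees every class.
        idx = [rng.choice(m) for m in by_class.values() for _ in m]
        model = centroid_fit([X_lab[i] for i in idx],
                             [y_lab[i] for i in idx], feats)
        for v, row in zip(votes, X_unl):
            v[centroid_predict(model, row, feats)] += 1
    out = []
    for v in votes:
        label, count = v.most_common(1)[0]
        out.append((label, count / n_members))
    return out

def loo_accuracy(X, y, feats):
    """Leave-one-out accuracy of the centroid classifier on a feature subset."""
    hits = 0
    for i in range(len(X)):
        model = centroid_fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:], feats)
        hits += centroid_predict(model, X[i], feats) == y[i]
    return hits / len(X)

def forward_select(X, y, n_features, k):
    """Greedy forward search: each candidate is scored together with the
    already selected features, so feature combinations are evaluated."""
    selected = []
    while len(selected) < k:
        remaining = [f for f in range(n_features) if f not in selected]
        best = max(remaining, key=lambda f: loo_accuracy(X, y, selected + [f]))
        selected.append(best)
    return selected

def confident_wrapper_selection(X_lab, y_lab, X_unl, k, threshold=0.8):
    """Add confidently pseudo-labeled rows, then run the wrapper search."""
    n_features = len(X_lab[0])
    scored = ensemble_confidence(X_lab, y_lab, X_unl, list(range(n_features)))
    X_aug, y_aug = list(X_lab), list(y_lab)
    for row, (label, conf) in zip(X_unl, scored):
        if conf >= threshold:  # keep only confident predictions
            X_aug.append(row)
            y_aug.append(label)
    selected = forward_select(X_aug, y_aug, n_features, k)
    return selected, len(X_aug) - len(X_lab)

# Toy data: feature 0 separates the classes, feature 1 is noise.
X_lab = [[0.0, 5.0], [1.0, 1.0], [9.0, 4.0], [10.0, 2.0]]
y_lab = [0, 0, 1, 1]
X_unl = [[0.5, 3.0], [9.5, 3.0], [0.2, 1.5], [9.8, 4.5]]

selected, n_added = confident_wrapper_selection(X_lab, y_lab, X_unl, k=1)
print(selected, n_added)
```

In this sketch the wrapper search runs once on the labeled data plus the confidently pseudo-labeled data, rather than being repeated many times to count feature frequencies. Whether this matches the source of the speed-up reported for EN-SemiFS is not stated in the abstract.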
Keywords :
data handling; feature extraction; learning (artificial intelligence); pattern classification; FW-SemiFS; confident unlabeled data; ensemble classifier; ensemble-based semi-supervised feature selection; iterative supervised sequential forward feature selection; pattern recognition; wrapper-type semi-supervised feature selection; Accuracy; Classification algorithms; Information filters; Prediction algorithms; Training; Training data; ensemble learning; feature selection; labeled data; semisupervised feature selection; unlabeled data
Conference_Title :
Artificial Intelligence, Management Science and Electronic Commerce (AIMSEC), 2011 2nd International Conference on
Conference_Location :
Deng Leng
Print_ISBN :
978-1-4577-0535-9
DOI :
10.1109/AIMSEC.2011.6010202