DocumentCode :
3426522
Title :
Incremental Wrapper-based subset Selection with replacement: An advantageous alternative to sequential forward selection
Author :
Bermejo, Pablo ; Gámez, Jose A. ; Puerta, Jose M.
Author_Institution :
Comput. Syst. Dept., Univ. de Castilla-La Mancha, Albacete
fYear :
2009
fDate :
March 30 2009-April 2 2009
Firstpage :
367
Lastpage :
374
Abstract :
This paper deals with the problem of wrapper-based feature subset selection in classification-oriented datasets with a (very) large number of attributes. In such datasets, sophisticated search algorithms such as beam search, branch and bound, best-first search, and genetic algorithms become intractable in the wrapper approach because of the high number of wrapper evaluations that must be carried out. One way to alleviate this problem is the so-called filter-wrapper approach, or Incremental Wrapper-based Subset Selection (IWSS), which first constructs a ranking of the predictive attributes using a filter measure and then applies a wrapper approach guided by that ranking. In this way, the number of wrapper evaluations is linear in the number of predictive attributes. In this paper we present a contribution to the IWSS approach that helps it obtain more compact subsets: it allows not only the addition of new attributes but also their interchange with attributes already included in the selected subset. The disadvantage of this modification is that it raises the worst-case complexity of IWSS to O(n^2); however, as with the well-known sequential forward selection (SFS), the actual number of wrapper evaluations is considerably smaller. Empirical tests on 7 (biological) datasets with a large number of attributes demonstrate the success of the proposed approach compared with both IWSS and SFS.
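The add-or-swap idea described in the abstract can be sketched roughly as follows. This is a minimal illustration built only from the abstract, not the authors' implementation: the function name `iwss_with_replacement`, the `evaluate` callback (standing in for a wrapper evaluation such as cross-validated accuracy), the `min_gain` threshold, and the tie-breaking rule that favors swaps are all assumptions made for the example.

```python
from typing import Callable, FrozenSet, List, Sequence


def iwss_with_replacement(
    ranked: Sequence[str],
    evaluate: Callable[[FrozenSet[str]], float],
    min_gain: float = 0.0,
) -> List[str]:
    """One pass over a filter ranking, allowing an add or a swap per step.

    ranked   -- attribute names, best filter score first (assumed precomputed)
    evaluate -- hypothetical wrapper score for a subset (higher is better)
    min_gain -- assumed minimum improvement required to accept a change
    """
    selected: List[str] = []
    best_score = float("-inf")  # so the top-ranked attribute is always taken

    for candidate in ranked:
        # Collect every move: swap the candidate with each selected
        # attribute, then plain addition. Swaps come first so that they
        # win ties, which favors smaller subsets (an assumption).
        trials = [
            selected[:i] + [candidate] + selected[i + 1:]
            for i in range(len(selected))
        ]
        trials.append(selected + [candidate])

        move, move_score = None, best_score
        for trial in trials:
            score = evaluate(frozenset(trial))  # one wrapper evaluation
            if score > move_score:
                move, move_score = trial, score

        # Accept the best move only if it beats the current subset.
        if move is not None and move_score > best_score + min_gain:
            selected, best_score = move, move_score

    return selected
```

Each candidate costs at most len(selected) + 1 wrapper evaluations, which is where the quadratic worst case comes from. When the selected subset stays small, the actual count remains far below that bound.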
Keywords :
computational complexity; data mining; feature extraction; learning (artificial intelligence); search problems; set theory; classification oriented datasets; feature subset selection; incremental wrapper-based subset selection; search algorithms; sequential forward selection; Concrete; Data mining; Evolutionary computation; Filters; Frequency selective surfaces; Genetic algorithms; Intelligent systems; Structural beams; Supervised learning; Testing;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computational Intelligence and Data Mining, 2009. CIDM '09. IEEE Symposium on
Conference_Location :
Nashville, TN
Print_ISBN :
978-1-4244-2765-9
Type :
conf
DOI :
10.1109/CIDM.2009.4938673
Filename :
4938673