• DocumentCode
    3426522
  • Title

    Incremental Wrapper-based subset Selection with replacement: An advantageous alternative to sequential forward selection

  • Author

    Bermejo, Pablo ; Gámez, Jose A. ; Puerta, Jose M.

  • Author_Institution
    Comput. Syst. Dept., Univ. de Castilla-La Mancha, Albacete
  • fYear
    2009
  • fDate
    March 30 2009-April 2 2009
  • Firstpage
    367
  • Lastpage
    374
  • Abstract
    This paper addresses wrapper-based feature subset selection in classification-oriented datasets with a (very) large number of attributes. In such datasets, sophisticated search algorithms such as beam search, branch and bound, best-first search, and genetic algorithms become intractable in the wrapper approach because of the high number of wrapper evaluations required. One way to alleviate this problem is the so-called filter-wrapper approach, or Incremental Wrapper-based Subset Selection (IWSS), which first ranks the predictive attributes using a filter measure and then applies a wrapper search guided by that ranking. In this way the number of wrapper evaluations is linear in the number of predictive attributes. In this paper we present a contribution to the IWSS approach that helps it obtain more compact subsets: it allows not only the addition of new attributes but also their interchange with attributes already included in the selected subset. The disadvantage of this modification is that it raises the worst-case complexity of IWSS to O(n^2); however, as with the well-known sequential forward selection (SFS), the actual number of wrapper evaluations is considerably smaller. Empirical tests on 7 biological datasets with a large number of attributes demonstrate the success of the proposed approach compared with both IWSS and SFS.
  • Keywords
    computational complexity; data mining; feature extraction; learning (artificial intelligence); search problems; set theory; classification oriented datasets; feature subset selection; incremental wrapper-based subset selection; search algorithms; sequential forward selection; Concrete; Data mining; Evolutionary computation; Filters; Frequency selective surfaces; Genetic algorithms; Intelligent systems; Structural beams; Supervised learning; Testing;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computational Intelligence and Data Mining, 2009. CIDM '09. IEEE Symposium on
  • Conference_Location
    Nashville, TN
  • Print_ISBN
    978-1-4244-2765-9
  • Type

    conf

  • DOI
    10.1109/CIDM.2009.4938673
  • Filename
    4938673
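
The search described in the abstract (rank attributes with a filter, then walk the ranking and either add the next attribute or swap it with one already selected) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `iwss_with_replacement` and `toy_score` are hypothetical, the ranking is assumed to be precomputed by some filter measure, `evaluate` stands in for any wrapper evaluation (for example a cross-validated classifier accuracy), and a move is accepted on a plain strict improvement, which may differ from the acceptance criterion used in the paper.

```python
from typing import Callable, FrozenSet, Sequence, Tuple


def iwss_with_replacement(
    ranked: Sequence[int],
    evaluate: Callable[[FrozenSet[int]], float],
) -> Tuple[FrozenSet[int], float, int]:
    """Walk a filter ranking; for each attribute try swaps, then addition.

    Assumption: a move is kept only if it strictly improves the wrapper
    score. Returns (selected subset, its score, number of wrapper calls).
    """
    selected: FrozenSet[int] = frozenset()
    best_score = evaluate(selected)
    evaluations = 1

    for candidate in ranked:
        # Candidate moves: swap the new attribute with each selected one,
        # then plain addition (the only move IWSS without replacement has).
        moves = [(selected - {old}) | {candidate} for old in sorted(selected)]
        moves.append(selected | {candidate})

        best_move, best_move_score = None, best_score
        for subset in moves:
            score = evaluate(subset)
            evaluations += 1
            if score > best_move_score:
                best_move, best_move_score = subset, score

        if best_move is not None:
            selected, best_score = best_move, best_move_score

    return selected, best_score, evaluations


def toy_score(subset: FrozenSet[int]) -> float:
    """Hypothetical wrapper score: attribute 2 is best, 0 is second best,
    and every extra attribute costs a small penalty."""
    if not subset:
        return 0.0
    base = 0.9 if 2 in subset else 0.6 if 0 in subset else 0.3
    return base - 0.05 * (len(subset) - 1)


if __name__ == "__main__":
    subset, score, calls = iwss_with_replacement([0, 1, 2, 3], toy_score)
    print(sorted(subset), score, calls)
```

In the toy run, attribute 0 is selected first and later swapped out for attribute 2, so the final subset stays at one attribute where addition alone would keep two. Each ranked attribute costs at most one evaluation per already-selected attribute plus one, which is where the O(n^2) worst case comes from; the count stays far lower when the selected subset remains small.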