  • DocumentCode
    1445991
  • Title
    A Filter Approach to Multiple Feature Construction for Symbolic Learning Classifiers Using Genetic Programming
  • Author
    Neshatian, Kourosh ; Zhang, Mengjie ; Andreae, Peter
  • Author_Institution
    Sch. of Eng. & Comput. Sci., Victoria Univ. of Wellington, Wellington, New Zealand
  • Volume
    16
  • Issue
    5
  • fYear
    2012
  • Firstpage
    645
  • Lastpage
    661
  • Abstract
    Feature construction is an effort to transform the input space of classification problems in order to improve the classification performance. Feature construction is particularly important for classifier inducers that cannot transform their input space intrinsically. This paper proposes GPMFC, a multiple-feature construction system for classification problems using genetic programming (GP). This paper takes a nonwrapper approach by introducing a filter-based measure of goodness for constructed features. The constructed, high-level features are functions of original input features. These functions are evolved by GP using an entropy-based fitness function that maximizes the purity of class intervals. A decomposable objective function is proposed so that the system is able to construct multiple high-level features for each problem. The constructed features are used to transform the original input space to a new space with better separability. Extensive experiments are conducted on a number of benchmark problems and symbolic learning classifiers. The results show that, in most cases, the new approach is highly effective in increasing the classification performance in rule-based and decision tree classifiers. The constructed features help improve the learning performance of symbolic learners. The constructed features, however, may lack intelligibility.
  • Keywords
    benchmark testing; decision trees; entropy; feature extraction; filters; genetic algorithms; knowledge based systems; learning (artificial intelligence); pattern classification; GP; GPMFC; benchmark problems; class intervals; classification problems; classifier inducers; decision tree classifiers; decomposable objective function; entropy-based fitness function; filter approach; filter-based measurement; genetic programming; learning performance; multiple high-level features; multiple-feature construction system; nonwrapper approach; rule-based classifiers; symbolic learning classifiers; Biological cells; Decision trees; Feature extraction; Genetic programming; Machine learning algorithms; Numerical models; Transforms; Classification; decision trees; feature construction; genetic programming (GP); rule-based systems
  • fLanguage
    English
  • Journal_Title
    Evolutionary Computation, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1089-778X
  • Type
    jour
  • DOI
    10.1109/TEVC.2011.2166158
  • Filename
    6151112