• Title of article

    Diversification for better classification trees

  • Author/Authors

    Zhiwei Fu, Bruce L. Golden, Shreevardhan Lele, S. Raghavan, Edward Wasil

  • Issue Information
    Monthly, serial issue for 2006
  • Pages
    18
  • From page
    3185
  • To page
    3202
  • Abstract
    Classification trees are widely used in the data mining community. Typically, trees are constructed to maximize their mean classification accuracy. In this paper, we propose an alternative to using the mean accuracy as the performance measure of a tree. We investigate the use of various percentiles (representing the risk aversion of a decision maker) of the distribution of classification accuracy in place of the mean. We develop a genetic algorithm (GA) to build decision trees based on this new criterion. We develop this GA further by explicitly creating diversity in the population by simultaneously considering two fitness criteria within the GA. We show that our bicriterion GA performs quite well, scales up to handle large data sets, and requires a small sample of the original data to build a good decision tree.
  • Keywords
    Classification trees, Genetic algorithm, Data mining
  • Journal title
    Computers and Operations Research
  • Serial Year
    2006
  • Record number

    928815