• DocumentCode
    3060183
  • Title

    Using evolutionary sampling to mine imbalanced data

  • Author

    Drown, Dennis J. ; Khoshgoftaar, Taghi M. ; Narayanan, Ramaswamy

  • Author_Institution
    Florida Atlantic Univ., Boca Raton
  • fYear
    2007
  • fDate
    13-15 Dec. 2007
  • Firstpage
    363
  • Lastpage
    368
  • Abstract
    Class imbalance tends to cause inferior performance in data mining learners. Evolutionary sampling is a technique which seeks to counter this problem by using genetic algorithms to evolve a reduced sample of a complete dataset to train a classification model. Evolutionary sampling works to remove noisy and duplicate instances so that the sampled training data will produce a superior classifier. We propose this novel technique as a method to handle severe class imbalance in data mining. This paper presents our research into the use of evolutionary sampling with C4.5 decision trees and compares the technique's performance with random undersampling.
  • Keywords
    data mining; decision trees; genetic algorithms; random processes; C4.5 decision trees; data mining learners; evolutionary sampling; genetic algorithms; imbalanced data mining; random undersampling; sampled training data; Artificial neural networks; Counting circuits; Data mining; Decision trees; Genetic algorithms; Java; Libraries; Machine learning; Sampling methods; Training data;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Machine Learning and Applications, 2007. ICMLA 2007. Sixth International Conference on
  • Conference_Location
    Cincinnati, OH
  • Print_ISBN
    978-0-7695-3069-7
  • Type

    conf

  • DOI
    10.1109/ICMLA.2007.73
  • Filename
    4457257