DocumentCode
3060183
Title
Using evolutionary sampling to mine imbalanced data
Author
Drown, Dennis J. ; Khoshgoftaar, Taghi M. ; Narayanan, Ramaswamy
Author_Institution
Florida Atlantic Univ., Boca Raton
fYear
2007
fDate
13-15 Dec. 2007
Firstpage
363
Lastpage
368
Abstract
Class imbalance tends to cause inferior performance in data mining learners. Evolutionary sampling is a technique that seeks to counter this problem by using genetic algorithms to evolve a reduced sample of a complete dataset to train a classification model. Evolutionary sampling works to remove noisy and duplicate instances so that the sampled training data will produce a superior classifier. We propose this novel technique as a method to handle severe class imbalance in data mining. This paper presents our research into the use of evolutionary sampling with C4.5 decision trees and compares the technique's performance with random undersampling.
Keywords
data mining; decision trees; genetic algorithms; random processes; C4.5 decision trees; data mining learners; evolutionary sampling; genetic algorithms; imbalanced data mining; random undersampling; sampled training data; Artificial neural networks; Counting circuits; Data mining; Decision trees; Genetic algorithms; Java; Libraries; Machine learning; Sampling methods; Training data;
fLanguage
English
Publisher
IEEE
Conference_Titel
Machine Learning and Applications, 2007. ICMLA 2007. Sixth International Conference on
Conference_Location
Cincinnati, OH
Print_ISBN
978-0-7695-3069-7
Type
conf
DOI
10.1109/ICMLA.2007.73
Filename
4457257