DocumentCode :
3060183
Title :
Using evolutionary sampling to mine imbalanced data
Author :
Drown, Dennis J. ; Khoshgoftaar, Taghi M. ; Narayanan, Ramaswamy
Author_Institution :
Florida Atlantic Univ., Boca Raton
fYear :
2007
fDate :
13-15 Dec. 2007
Firstpage :
363
Lastpage :
368
Abstract :
Class imbalance tends to cause inferior performance in data mining learners. Evolutionary sampling is a technique which seeks to counter this problem by using genetic algorithms to evolve a reduced sample of a complete dataset to train a classification model. Evolutionary sampling works to remove noisy and duplicate instances so that the sampled training data will produce a superior classifier. We propose this novel technique as a method to handle severe class imbalance in data mining. This paper presents our research into the use of evolutionary sampling with C4.5 decision trees and compares the technique's performance with random undersampling.
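The idea in the abstract can be illustrated with a minimal sketch. The abstract does not state how instances are encoded, which fitness function is used, or which genetic operators are applied, so everything below is an assumption made for illustration: a boolean keep/drop mask per instance, a geometric mean of per-class recall as fitness, truncation selection with elitism, one-point crossover, and bit-flip mutation. A simple nearest-centroid learner on one-dimensional data stands in for C4.5, and the names `nearest_centroid_predict`, `fitness`, and `evolve_sample` are hypothetical.

```python
import random

def nearest_centroid_predict(train, x):
    """Stand-in learner (NOT C4.5): predict the class whose 1-D mean is closest to x."""
    sums = {}
    for value, label in train:
        s, n = sums.get(label, (0.0, 0))
        sums[label] = (s + value, n + 1)
    return min(sums, key=lambda c: abs(sums[c][0] / sums[c][1] - x))

def fitness(mask, data):
    """Assumed fitness: geometric mean of per-class recall, measured on the full dataset."""
    sample = [row for keep, row in zip(mask, data) if keep]
    classes = {label for _, label in data}
    if {label for _, label in sample} != classes:
        return 0.0  # a sample that drops a whole class cannot train a usable model
    score = 1.0
    for c in classes:
        members = [x for x, label in data if label == c]
        hits = sum(nearest_centroid_predict(sample, x) == c for x in members)
        score *= hits / len(members)
    return score ** (1.0 / len(classes))

def evolve_sample(data, pop_size=20, generations=30, mutation_rate=0.02, seed=0):
    """Evolve a keep/drop mask over the instances; returns the best mask found."""
    rng = random.Random(seed)
    n = len(data)
    population = [[rng.random() < 0.5 for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=lambda m: fitness(m, data), reverse=True)
        parents = ranked[: pop_size // 2]   # truncation selection (assumed)
        children = [ranked[0][:]]           # elitism: carry the best mask forward
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)       # one-point crossover (assumed)
            child = a[:cut] + b[cut:]
            # bit-flip mutation: occasionally toggle whether an instance is kept
            child = [(not bit) if rng.random() < mutation_rate else bit for bit in child]
            children.append(child)
        population = children
    return max(population, key=lambda m: fitness(m, data))

if __name__ == "__main__":
    rng = random.Random(1)
    # Toy imbalanced data: 40 majority instances, 5 minority instances.
    data = [(rng.gauss(0, 1), "neg") for _ in range(40)]
    data += [(rng.gauss(4, 1), "pos") for _ in range(5)]
    best = evolve_sample(data)
    print("instances kept:", sum(best), "of", len(data))
    print("fitness:", round(fitness(best, data), 3))
```

Random undersampling, the baseline named in the abstract, would instead discard majority-class instances at random; the sketch differs in that the mask is chosen by search against a performance measure rather than by chance.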
Keywords :
data mining; decision trees; genetic algorithms; random processes; C4.5 decision trees; data mining learners; evolutionary sampling; genetic algorithms; imbalanced data mining; random undersampling; sampled training data; Artificial neural networks; Counting circuits; Data mining; Decision trees; Genetic algorithms; Java; Libraries; Machine learning; Sampling methods; Training data;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Machine Learning and Applications, 2007. ICMLA 2007. Sixth International Conference on
Conference_Location :
Cincinnati, OH
Print_ISBN :
978-0-7695-3069-7
Type :
conf
DOI :
10.1109/ICMLA.2007.73
Filename :
4457257