DocumentCode :
2851174
Title :
Evolutionary Training Set Selection to Optimize C4.5 in Imbalanced Problems
Author :
Garcia, Sergio ; Herrera, Francisco
Author_Institution :
Dept. of Comput. Sci. & Artificial Intell., Univ. of Granada, Granada
fYear :
2008
fDate :
10-12 Sept. 2008
Firstpage :
567
Lastpage :
572
Abstract :
Classification in imbalanced domains is a recent challenge in machine learning. A classification problem is imbalanced when the data contain many examples of one class and few of the other, and the under-represented class is the one of greater interest. One of the most widely used techniques for tackling this problem is to preprocess the data before the learning process. This preprocessing can be done through under-sampling, which removes examples, mainly from the majority class, or through over-sampling, which replicates existing minority examples or generates new ones. This contribution proposes an under-sampling procedure based on evolutionary algorithms that performs training set selection to optimize the models obtained by the C4.5 decision tree. The proposal has been compared with other under-sampling and over-sampling techniques; the results are very competitive in terms of accuracy, and the obtained models are more interpretable.
Keywords :
decision trees; evolutionary computation; learning (artificial intelligence); optimisation; C4.5 decision tree; evolutionary training set selection; imbalanced classification; imbalanced domains; learning process; machine learning; optimisation; over-sampling techniques; Artificial intelligence; Computer science; Data mining; Decision trees; Evolutionary computation; Finance; Hybrid intelligent systems; Machine learning; Proposals; World Wide Web;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Hybrid Intelligent Systems, 2008. HIS '08. Eighth International Conference on
Conference_Location :
Barcelona
Print_ISBN :
978-0-7695-3326-1
Electronic_ISBN :
978-0-7695-3326-1
Type :
conf
DOI :
10.1109/HIS.2008.67
Filename :
4626690