Title :
Sampling + reweighting: Boosting the performance of AdaBoost on imbalanced datasets
Author :
Yuan, Bo ; Ma, Xiaoli
Author_Institution :
Intell. Comput. Lab., Tsinghua Univ., Shenzhen, China
Abstract :
Existing attempts to improve the performance of AdaBoost on imbalanced datasets have largely focused on modifying its weight updating rule or incorporating sampling or cost-sensitive learning techniques. In this paper, we propose to tackle the challenge from a novel perspective. First, the dataset is over-sampled and standard AdaBoost is applied to create a series of base classifiers. Next, the weights of these classifiers are re-optimized by Genetic Algorithms (GAs) or comparable optimization techniques, in which more targeted performance measures such as G-mean and F-measure can be used directly as the objective function. Consequently, unlike other indirect solutions, this sampling + reweighting strategy can purposefully tune AdaBoost towards a specific performance measure of interest with only moderate computational overhead. Experimental results on ten benchmark datasets show that this strategy reliably boosts the performance of AdaBoost and consistently outperforms EasyEnsemble, a strong ensemble method for class imbalance learning.
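For a concrete picture of the pipeline described above, the sketch below combines SMOTE over-sampling, a standard scikit-learn AdaBoost ensemble, and a simple mutation-based reweighting loop that maximizes G-mean on a validation set. It is an illustration under those assumptions, not the authors' implementation: the paper uses a Genetic Algorithm for the reweighting step, and names such as `sample_and_reweight`, the validation split, and the mutation parameters are hypothetical.

```python
# Minimal sketch of the "sampling + reweighting" idea (assumed libraries:
# scikit-learn and imbalanced-learn; the mutation loop is a stand-in for a GA).
import numpy as np
from imblearn.over_sampling import SMOTE
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import confusion_matrix

def g_mean(y_true, y_pred):
    # G-mean = sqrt(sensitivity * specificity) for a binary problem in {-1, +1}
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    sens = tp / (tp + fn) if (tp + fn) else 0.0
    spec = tn / (tn + fp) if (tn + fp) else 0.0
    return np.sqrt(sens * spec)

def weighted_vote(base_preds, weights):
    # base_preds: (n_estimators, n_samples) array of {-1, +1} predictions
    return np.where(weights @ base_preds >= 0, 1, -1)

def sample_and_reweight(X_train, y_train, X_val, y_val,
                        n_estimators=50, generations=200, sigma=0.1, seed=0):
    rng = np.random.default_rng(seed)

    # Step 1: over-sample the minority class, then fit standard AdaBoost.
    X_res, y_res = SMOTE(random_state=seed).fit_resample(X_train, y_train)
    ada = AdaBoostClassifier(n_estimators=n_estimators, random_state=seed)
    ada.fit(X_res, y_res)

    # Cache base-classifier predictions on the validation set (labels mapped to +/-1).
    classes = ada.classes_
    base_preds = np.array([np.where(est.predict(X_val) == classes[1], 1, -1)
                           for est in ada.estimators_])
    y_val_pm = np.where(y_val == classes[1], 1, -1)

    # Step 2: re-tune the ensemble weights, using G-mean directly as the
    # objective (a simple mutation-based search instead of a full GA).
    weights = ada.estimator_weights_[:len(ada.estimators_)].copy()
    best_score = g_mean(y_val_pm, weighted_vote(base_preds, weights))
    for _ in range(generations):
        candidate = np.clip(weights + rng.normal(0, sigma, size=weights.shape), 0, None)
        score = g_mean(y_val_pm, weighted_vote(base_preds, candidate))
        if score >= best_score:
            weights, best_score = candidate, score
    return ada, weights, best_score
```

Replacing the `g_mean` objective with an F-measure points the same loop at that metric instead, which mirrors the abstract's claim that the reweighting step can be aimed at whichever performance measure is of interest.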
Keywords :
genetic algorithms; learning (artificial intelligence); EasyEnsemble; F-measure; G-mean; GA; benchmark datasets; class imbalance learning; imbalanced datasets; optimization techniques; reweighting; sampling; cost-sensitive learning techniques; standard AdaBoost; Accuracy; Cancer; Glass; Heart; Single photon emission computed tomography; Standards; Training; AdaBoost; Class Imbalance Learning; GAs; SMOTE;
Conference_Title :
The 2012 International Joint Conference on Neural Networks (IJCNN)
Conference_Location :
Brisbane, QLD
Print_ISBN :
978-1-4673-1488-6
ISSN :
2161-4393
DOI :
10.1109/IJCNN.2012.6252738