DocumentCode :
3336510
Title :
Resampling or Reweighting: A Comparison of Boosting Implementations
Author :
Seiffert, Chris ; Khoshgoftaar, Taghi M. ; Van Hulse, Jason ; Napolitano, Amri
Author_Institution :
Florida Atlantic Univ., Boca Raton, FL
Volume :
1
fYear :
2008
fDate :
3-5 Nov. 2008
Firstpage :
445
Lastpage :
451
Abstract :
Boosting has been shown to improve classifier performance in many situations, including when the training data are imbalanced. There are, however, two possible implementations of boosting, and it is unclear which should be used. Boosting by reweighting is typically used, but it can only be applied to base learners that are designed to handle example weights. Boosting by resampling, on the other hand, can be applied to any base learner. In this work, we empirically evaluate the differences between these two boosting implementations using imbalanced training data. Using 10 boosting algorithms, 4 learners, and 15 datasets, we find that boosting by resampling performs as well as, or significantly better than, boosting by reweighting (which is often the default boosting implementation). We therefore conclude that, in general, boosting by resampling is preferred over boosting by reweighting.
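The difference between the two implementations described in the abstract can be illustrated with a minimal sketch. It assumes an AdaBoost.M1-style update, a one-dimensional decision stump as the base learner, and labels in {-1, +1}. None of this is taken from the paper: the function names (`train_stump`, `adaboost`, `predict`) and the toy data are hypothetical. In "reweight" mode the example weights are passed to the learner. In "resample" mode a sample is drawn with probability proportional to the weights, and the learner is trained on it with uniform weights, so it never has to understand example weights.

```python
import math
import random

def train_stump(X, y, w):
    """Fit a 1-D decision stump minimizing weighted error.
    Returns (threshold, polarity, weighted_error)."""
    best = None
    for t in sorted(set(X)):
        for pol in (1, -1):
            # The stump predicts `pol` when x >= t and `-pol` otherwise.
            err = sum(wi for xi, yi, wi in zip(X, y, w)
                      if (pol if xi >= t else -pol) != yi)
            if best is None or err < best[2]:
                best = (t, pol, err)
    return best

def predict_stump(stump, x):
    t, pol, _ = stump
    return pol if x >= t else -pol

def adaboost(X, y, rounds=10, mode="reweight", seed=0):
    """AdaBoost.M1-style loop (assumed variant); returns [(alpha, stump), ...]."""
    rng = random.Random(seed)
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        if mode == "reweight":
            # Reweighting: the base learner consumes the weights directly.
            stump = train_stump(X, y, w)
        else:
            # Resampling: draw n examples with probability proportional to w,
            # then train on the sample with uniform weights.
            idx = rng.choices(range(n), weights=w, k=n)
            stump = train_stump([X[i] for i in idx],
                                [y[i] for i in idx],
                                [1.0 / n] * n)
        # In both modes the error is measured on the weighted original data.
        err = sum(wi for xi, yi, wi in zip(X, y, w)
                  if predict_stump(stump, xi) != yi)
        err = max(err, 1e-10)  # avoid division by zero for a perfect stump
        if err >= 0.5:
            break  # weak-learner condition violated; stop early
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((alpha, stump))
        # Raise the weights of misclassified examples, lower the rest.
        w = [wi * math.exp(-alpha * yi * predict_stump(stump, xi))
             for xi, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, x):
    """Weighted vote of the stumps; ties and empty ensembles map to -1."""
    score = sum(alpha * predict_stump(stump, x) for alpha, stump in ensemble)
    return 1 if score > 0 else -1

# Toy imbalanced data (hypothetical): 8 negative and 2 positive examples.
X = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [-1] * 8 + [1] * 2
by_reweighting = adaboost(X, y, mode="reweight")
by_resampling = adaboost(X, y, mode="resample", seed=1)
```

The only line that differs between the two modes is how the base learner is trained. The weight update and the final weighted vote are shared, which is why resampling can wrap any learner while reweighting requires one that accepts weights.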
Keywords :
learning (artificial intelligence); pattern classification; sampling methods; AdaBoost; boosting algorithm; imbalanced training data classification; resampling method; reweighting method; Algorithm design and analysis; Artificial intelligence; Boosting; Data mining; Iterative algorithms; Medical diagnosis; Sampling methods; Training data; USA Councils; Voting;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Tools with Artificial Intelligence, 2008. ICTAI '08. 20th IEEE International Conference on
Conference_Location :
Dayton, OH
ISSN :
1082-3409
Print_ISBN :
978-0-7695-3440-4
Type :
conf
DOI :
10.1109/ICTAI.2008.59
Filename :
4669722