Title :
Resampling or Reweighting: A Comparison of Boosting Implementations
Author :
Seiffert, Chris ; Khoshgoftaar, Taghi M. ; Van Hulse, Jason ; Napolitano, Amri
Author_Institution :
Florida Atlantic Univ., Boca Raton, FL
Abstract :
Boosting has been shown to improve the performance of classifiers in many situations, including when data is imbalanced. There are, however, two possible implementations of boosting, and it is unclear which should be used. Boosting by reweighting is typically used, but it can only be applied to base learners that are designed to handle example weights. Boosting by resampling, on the other hand, can be applied to any base learner. In this work, we empirically evaluate the differences between these two boosting implementations using imbalanced training data. Using 10 boosting algorithms, 4 learners and 15 datasets, we find that boosting by resampling performs as well as, or significantly better than, boosting by reweighting (which is often the default boosting implementation). We therefore conclude that, in general, boosting by resampling is preferred over boosting by reweighting.
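The distinction between the two implementations can be illustrated with a minimal sketch. It is not taken from the paper: it assumes a simplified binary AdaBoost with labels in {-1, +1}, a hypothetical one-dimensional decision stump (`fit_stump`) as the base learner, and a hypothetical `adaboost` function whose `mode` argument switches between the two variants. In "reweight" mode the example weights are passed directly to the base learner; in "resample" mode a bootstrap sample is drawn in proportion to the weights and the learner is trained on it with uniform weights, so the learner never needs to understand weights.

```python
import math
import random


def fit_stump(X, y, w):
    """Fit a weighted decision stump on 1-D inputs X with labels y in {-1, +1}."""
    best = None
    for thr in sorted(set(X)):
        for sign in (1, -1):
            # Weighted error of the rule: predict `sign` if x >= thr, else `-sign`.
            err = sum(wi for xi, yi, wi in zip(X, y, w)
                      if (sign if xi >= thr else -sign) != yi)
            if best is None or err < best[0]:
                best = (err, thr, sign)
    _, thr, sign = best
    return lambda x: sign if x >= thr else -sign


def adaboost(X, y, rounds=10, mode="reweight", seed=0):
    """Simplified binary AdaBoost; `mode` is "reweight" or "resample" (illustrative)."""
    rng = random.Random(seed)
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        if mode == "reweight":
            # Reweighting: the base learner consumes the example weights itself.
            h = fit_stump(X, y, w)
        else:
            # Resampling: draw n examples with probability proportional to the
            # weights, then train on that sample with uniform weights.
            idx = rng.choices(range(n), weights=w, k=n)
            h = fit_stump([X[i] for i in idx], [y[i] for i in idx], [1.0 / n] * n)
        # In both modes the error is measured on the original weighted data.
        err = sum(wi for xi, yi, wi in zip(X, y, w) if h(xi) != yi)
        if err >= 0.5:
            continue  # skip a learner that is no better than chance
        err = max(err, 1e-10)  # avoid division by zero for a perfect learner
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, h))
        # Increase the weight of misclassified examples, then renormalize.
        w = [wi * math.exp(-alpha * yi * h(xi)) for xi, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return lambda x: 1 if sum(a * h(x) for a, h in ensemble) >= 0 else -1


# Small imbalanced toy set (8 negatives, 2 positives), made up for illustration.
X = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
y = [-1, -1, -1, -1, -1, -1, -1, -1, 1, 1]
by_reweighting = adaboost(X, y, mode="reweight")
by_resampling = adaboost(X, y, mode="resample")
```

The only difference between the two modes is how the weight distribution reaches the base learner; the weight update and the final weighted vote are shared. This sketch says nothing about which variant performs better, which is the empirical question the paper addresses.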
Keywords :
learning (artificial intelligence); pattern classification; sampling methods; AdaBoost; boosting algorithm; imbalanced training data classification; resampling method; reweighting method; Algorithm design and analysis; Artificial intelligence; Boosting; Data mining; Iterative algorithms; Medical diagnosis; Sampling methods; Training data; USA Councils; Voting
Conference_Title :
Tools with Artificial Intelligence, 2008. ICTAI '08. 20th IEEE International Conference on
Conference_Location :
Dayton, OH
Print_ISBN :
978-0-7695-3440-4
DOI :
10.1109/ICTAI.2008.59