DocumentCode
478617
Title
Improving Learner Performance with Data Sampling and Boosting
Author
Seiffert, Chris ; Khoshgoftaar, Taghi M. ; Van Hulse, Jason ; Napolitano, Amri
Author_Institution
Florida Atlantic Univ., Boca Raton, FL
Volume
1
fYear
2008
fDate
3-5 Nov. 2008
Firstpage
452
Lastpage
459
Abstract
Learning from imbalanced datasets is a well-known problem in the data mining community. Many techniques have been proposed to alleviate the problems associated with class imbalance, including data sampling and boosting. While data sampling has received the bulk of the attention from the research community, our results show that boosting often yields better classification performance than even the best data sampling techniques. In this work, we compare the performance of data sampling and boosting on ten datasets from various application domains using two commonly used learners. In addition, we propose using data sampling and boosting together in an attempt to combine the strengths of these techniques and achieve even better classification performance.
Keywords
data mining; learning (artificial intelligence); boosting; data sampling techniques; learning; Artificial intelligence; Costs; Iterative algorithms; Sampling methods; Training data; USA Councils
fLanguage
English
Publisher
IEEE
Conference_Titel
Tools with Artificial Intelligence, 2008. ICTAI '08. 20th IEEE International Conference on
Conference_Location
Dayton, OH
ISSN
1082-3409
Print_ISBN
978-0-7695-3440-4
Type
conf
DOI
10.1109/ICTAI.2008.58
Filename
4669723