DocumentCode :
2479029
Title :
RUSBoost: Improving classification performance when training data is skewed
Author :
Seiffert, Chris ; Khoshgoftaar, Taghi M. ; Van Hulse, Jason ; Napolitano, Amri
Author_Institution :
Florida Atlantic Univ., Boca Raton, FL
fYear :
2008
fDate :
8-11 Dec. 2008
Firstpage :
1
Lastpage :
4
Abstract :
Constructing classification models using skewed training data can be a challenging task. We present RUSBoost, a new algorithm for alleviating the problem of class imbalance. RUSBoost combines data sampling and boosting, providing a simple and efficient method for improving classification performance when training data is imbalanced. In addition to performing favorably when compared to SMOTEBoost (another hybrid sampling/boosting algorithm), RUSBoost is computationally less expensive than SMOTEBoost and results in significantly shorter model training times. This combination of simplicity, speed and performance makes RUSBoost an excellent technique for learning from imbalanced data.
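The combination of data sampling and boosting described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes binary labels in {-1, +1}, a plain AdaBoost-style weight update, decision stumps as the weak learner, and a 1:1 class ratio after undersampling, none of which is stated in this record. The names rusboost_fit, rusboost_predict, fit_stump and stump_predict are hypothetical.

```python
import math
import random


def fit_stump(X, y, w):
    """Weighted decision stump: choose (feature, threshold, polarity) with lowest weighted error."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for polarity in (1, -1):
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if (polarity if row[j] >= t else -polarity) != yi)
                if best is None or err < best[0]:
                    best = (err, j, t, polarity)
    return best[1:]


def stump_predict(stump, row):
    j, t, polarity = stump
    return polarity if row[j] >= t else -polarity


def rusboost_fit(X, y, rounds=10, seed=0):
    """Boosting with random undersampling of the majority class in every round (assumed variant)."""
    rng = random.Random(seed)
    n = len(X)
    w = [1.0 / n] * n
    minority_label = 1 if y.count(1) <= y.count(-1) else -1
    minority = [i for i in range(n) if y[i] == minority_label]
    majority = [i for i in range(n) if y[i] != minority_label]
    ensemble = []
    for _ in range(rounds):
        # Random undersampling: keep every minority example and draw an
        # equally sized random subset of the majority class (assumed 1:1 ratio).
        idx = minority + rng.sample(majority, len(minority))
        total = sum(w[i] for i in idx)
        stump = fit_stump([X[i] for i in idx],
                          [y[i] for i in idx],
                          [w[i] / total for i in idx])
        # The weighted error and the weight update use the full training set.
        preds = [stump_predict(stump, row) for row in X]
        err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the logarithm finite
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        w = [wi * math.exp(-alpha * yi * p) for wi, yi, p in zip(w, y, preds)]
        norm = sum(w)
        w = [wi / norm for wi in w]
    return ensemble


def rusboost_predict(ensemble, row):
    """Weighted vote of all stumps; ties go to the +1 class."""
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1
```

Because each weak learner is trained on a small balanced subset rather than on an enlarged synthetic set, every round works with fewer examples, which is consistent with the shorter training times the abstract reports relative to SMOTEBoost.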
Keywords :
data mining; learning (artificial intelligence); pattern classification; boosting algorithm; class imbalance; classification model; data mining; data sampling; machine learning; skewed training data; Algorithm design and analysis; Boosting; Costs; Data mining; Diseases; Iterative algorithms; Sampling methods; Training data; Voting
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Pattern Recognition, 2008. ICPR 2008. 19th International Conference on
Conference_Location :
Tampa, FL
ISSN :
1051-4651
Print_ISBN :
978-1-4244-2174-9
Electronic_ISBN :
1051-4651
Type :
conf
DOI :
10.1109/ICPR.2008.4761297
Filename :
4761297