Title :
A boosting approach to remove class label noise
Author :
Karmaker, Amitava ; Kwek, Stephen
Author_Institution :
Dept. of Comput. Sci., Texas Univ., San Antonio, TX, USA
Abstract :
Ensemble methods are known to improve prediction accuracy over the base learning algorithm, and AdaBoost is among the best recognized of them. However, it is susceptible to overfitting training instances corrupted by class label noise. This paper proposes a modification to AdaBoost that is more tolerant of class label noise, which further enhances its ability to boost prediction accuracy. In particular, we observe that in AdaBoost, the weight increase of noisy examples can be constrained by carefully applying a cut-off to their weights. The effectiveness of our algorithm is demonstrated empirically on artificially generated data. We also corroborate this on a number of data sets from the UCI repository (Blake and Merz, 1998). In both experimental settings, the results affirm the efficacy of our approach. Finally, we investigate some significant characteristics of our technique in noisy environments.
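The weight cut-off idea in the abstract can be illustrated with a minimal sketch. The abstract does not state the authors' actual cut-off rule, so everything specific here is an assumption made for illustration: the hypothetical `cap` parameter, the clip-then-renormalize step, and the one-dimensional decision stump used as the base learner. It is not the paper's algorithm, only one plausible way to keep a single hard example from dominating the weight distribution.

```python
import math

def best_stump(xs, ys, w):
    """Return (weighted error, threshold, polarity) of the best 1-D stump."""
    best = (float("inf"), 0.0, 1)
    for thr in sorted(set(xs)):
        for pol in (1, -1):
            # The stump predicts `pol` when x >= thr and `-pol` otherwise.
            err = sum(wi for xi, yi, wi in zip(xs, ys, w)
                      if (pol if xi >= thr else -pol) != yi)
            if err < best[0]:
                best = (err, thr, pol)
    return best

def capped_adaboost(xs, ys, rounds=10, cap=None):
    """AdaBoost with an optional per-example weight cap (an assumed rule)."""
    n = len(xs)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, thr, pol = best_stump(xs, ys, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0) and division by 0
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, thr, pol))
        # Standard AdaBoost update: misclassified examples gain weight.
        w = [wi * math.exp(-alpha * yi * (pol if xi >= thr else -pol))
             for xi, yi, wi in zip(xs, ys, w)]
        total = sum(w)
        w = [wi / total for wi in w]
        if cap is not None:
            # Assumed cut-off: clip each weight at `cap`, then renormalize.
            # After renormalizing, a weight may sit slightly above `cap`.
            w = [min(wi, cap) for wi in w]
            total = sum(w)
            w = [wi / total for wi in w]
    return ensemble, w

def predict(ensemble, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(a * (p if x >= t else -p) for a, t, p in ensemble)
    return 1 if score >= 0 else -1

# Toy data: the label at x = 2 is flipped to mimic class label noise.
xs = list(range(10))
ys = [-1, -1, 1, -1, -1, 1, 1, 1, 1, 1]
_, plain_w = capped_adaboost(xs, ys, rounds=1)
_, capped_w = capped_adaboost(xs, ys, rounds=1, cap=0.2)
print(plain_w[2], capped_w[2])
```

In this toy run the mislabeled point at `x = 2` is the only example the first stump gets wrong, so plain AdaBoost concentrates weight on it, while the capped variant limits that growth. How the cap should be chosen, and whether it is applied before or after normalization, are design choices this sketch does not settle.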
Keywords :
learning (artificial intelligence); pattern classification; AdaBoost; UCI repository; class label noise removal; ensemble methods; learning; prediction accuracy; Accuracy; Bagging; Boosting; Computer science; Degradation; Hybrid intelligent systems; Iterative algorithms; Testing; Voting; Working environment noise
Conference_Title :
Hybrid Intelligent Systems, 2005. HIS '05. Fifth International Conference on
Print_ISBN :
0-7695-2457-5
DOI :
10.1109/ICHIS.2005.1