DocumentCode :
3167144
Title :
Lazy Bagging for Classifying Imbalanced Data
Author :
Zhu, Xingquan
Author_Institution :
Florida Atlantic Univ., Boca Raton
fYear :
2007
fDate :
28-31 Oct. 2007
Firstpage :
763
Lastpage :
768
Abstract :
In this paper, we propose a lazy bagging (LB) design, which builds bootstrap replicate bags based on the characteristics of the test instances. Upon receiving a test instance Ik, LB trims the bootstrap bags by taking Ik's nearest neighbors in the training set into consideration. Our hypothesis is that an unlabeled instance's nearest neighbors provide valuable information for learners to refine their local decision boundaries for classifying this instance. By taking full advantage of Ik's nearest neighbors, the base learners can classify Ik with less bias and variance. This strategy is beneficial for classifying imbalanced data because refining local decision boundaries can help a learner reduce its inherent bias towards the majority class and improve its performance on minority class examples. Our experimental results confirm that LB outperforms C4.5 and TB in reducing classification error and, most importantly, that this error reduction comes largely from LB's improvement on minority class examples.
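The idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation, and several details are assumptions that the abstract does not specify: each bag is built from a bootstrap sample of the training set topped up with a bootstrap sample of the test instance's k nearest neighbors, a simple nearest-centroid rule stands in for the C4.5 base learner, and the bags are combined by majority vote. All function names, parameters, and the toy data are hypothetical.

```python
import math
import random
from collections import Counter


def nearest_neighbors(train, x, k):
    """Return the k (features, label) pairs in train closest to x (Euclidean)."""
    return sorted(train, key=lambda pair: math.dist(pair[0], x))[:k]


def build_lazy_bag(train, neighbors, rng):
    """Build one bag the same size as train (assumed composition).

    n - k instances are bootstrapped from the whole training set and
    k instances are bootstrapped from the test instance's neighbors.
    """
    n, k = len(train), len(neighbors)
    bag = [rng.choice(train) for _ in range(n - k)]
    bag += [rng.choice(neighbors) for _ in range(k)]
    return bag


def centroid_predict(bag, x):
    """Stand-in base learner: pick the class whose mean vector is closest to x."""
    groups = {}
    for features, label in bag:
        groups.setdefault(label, []).append(features)
    centroids = {
        label: [sum(col) / len(rows) for col in zip(*rows)]
        for label, rows in groups.items()
    }
    return min(centroids, key=lambda label: math.dist(centroids[label], x))


def lazy_bagging_predict(train, x, k=3, n_bags=11, seed=0):
    """Classify x by majority vote over bags built specifically for x."""
    rng = random.Random(seed)
    neighbors = nearest_neighbors(train, x, k)
    votes = Counter(
        centroid_predict(build_lazy_bag(train, neighbors, rng), x)
        for _ in range(n_bags)
    )
    return votes.most_common(1)[0][0]


# Toy imbalanced data: nine majority points near (0, 0), three minority near (5, 5).
train = [
    ((0.0, 0.0), "maj"), ((0.2, 0.1), "maj"), ((-0.1, 0.3), "maj"),
    ((0.3, -0.2), "maj"), ((0.1, 0.4), "maj"), ((-0.3, -0.1), "maj"),
    ((0.4, 0.2), "maj"), ((-0.2, 0.2), "maj"), ((0.0, -0.3), "maj"),
    ((5.0, 5.0), "min"), ((5.2, 4.8), "min"), ((4.9, 5.3), "min"),
]

print(lazy_bagging_predict(train, (5.1, 4.9)))
```

Because the bags are built only after the test instance arrives, every bag is guaranteed to contain examples from that instance's neighborhood. In this sketch, that keeps the minority class represented even when a plain bootstrap sample might miss it.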
Keywords :
learning (artificial intelligence); pattern classification; imbalanced data classification; lazy bagging design; local decision boundary; supervised classification learning; test instance; Accuracy; Bagging; Computer science; Data engineering; Data mining; Decision theory; Decision trees; Design engineering; Nearest neighbor searches; Testing;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Data Mining, 2007. ICDM 2007. Seventh IEEE International Conference on
Conference_Location :
Omaha, NE
ISSN :
1550-4786
Print_ISBN :
978-0-7695-3018-5
Type :
conf
DOI :
10.1109/ICDM.2007.95
Filename :
4470324