DocumentCode
2892292
Title
Disturbing Neighbors Ensembles of Trees for Imbalanced Data
Author
Rodriguez, J.J. ; Diez-Pastor, J.F. ; Maudes, J. ; Garcia-Osorio, C.
Author_Institution
Dept. of Civil Eng., University of Burgos, Burgos, Spain
Volume
2
fYear
2012
fDate
12-15 Dec. 2012
Firstpage
83
Lastpage
88
Abstract
Disturbing Neighbors (DN) is a method for generating classifier ensembles. It can also be combined with any other ensemble method, generally improving the results. This paper considers the application of these ensembles to imbalanced data: classification problems where the class proportions are significantly different. DN ensembles are compared and combined with Bagging, using three tree methods as base classifiers: conventional decision trees (C4.5), Hellinger distance decision trees (HDDT), a method designed for imbalanced data, and model trees (M5P), which have linear models at the leaves. The methods are compared using two collections of imbalanced datasets, with 20 and 66 datasets, respectively. The best results are obtained by combining Bagging and DN, using conventional decision trees.
Keywords
data handling; decision trees; pattern classification; Bagging; Hellinger distance decision trees; classification problems; classifier ensembles; conventional decision trees; disturbing neighbors; imbalanced data; linear models; model trees; tree methods; Accuracy; Boosting; Data mining; Data models
fLanguage
English
Publisher
IEEE
Conference_Titel
Machine Learning and Applications (ICMLA), 2012 11th International Conference on
Conference_Location
Boca Raton, FL
Print_ISBN
978-1-4673-4651-1
Type
conf
DOI
10.1109/ICMLA.2012.181
Filename
6406732