• DocumentCode
    2892292
  • Title

    Disturbing Neighbors Ensembles of Trees for Imbalanced Data

  • Author

    Rodriguez, Jeffrey J. ; Diez-Pastor, J.F. ; Maudes, J. ; Garcia-Osorio, C.

  • Author_Institution
    Dept. of Civil Eng., University of Burgos, Burgos, Spain
  • Volume
    2
  • fYear
    2012
  • fDate
    12-15 Dec. 2012
  • Firstpage
    83
  • Lastpage
    88
  • Abstract
    Disturbing Neighbors (DN) is a method for generating classifier ensembles. Moreover, it can be combined with any other ensemble method, generally improving the results. This paper considers the application of these ensembles to imbalanced data: classification problems where the class proportions are significantly different. DN ensembles are compared and combined with Bagging, using three tree methods as base classifiers: conventional decision trees (C4.5), Hellinger distance decision trees (HDDT; a method designed for imbalanced data), and model trees (M5P; trees with linear models at the leaves). The methods are compared using two collections of imbalanced datasets, with 20 and 66 datasets, respectively. The best results are obtained by combining Bagging and DN, using conventional decision trees.
  • Keywords
    data handling; decision trees; pattern classification; Bagging; Hellinger distance decision trees; classification problems; classifier ensembles; conventional decision trees; disturbing neighbors; imbalanced data; linear models; model trees; tree methods; Accuracy; Boosting; Data mining; Data models
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Machine Learning and Applications (ICMLA), 2012 11th International Conference on
  • Conference_Location
    Boca Raton, FL
  • Print_ISBN
    978-1-4673-4651-1
  • Type

    conf

  • DOI
    10.1109/ICMLA.2012.181
  • Filename
    6406732