• DocumentCode
    693137
  • Title
    Clustering-based subset ensemble learning method for imbalanced data
  • Author
    Xiao-Sheng Hu ; Run-Jing Zhang
  • Author_Institution
    Coll. of Electron. & Inf. Eng., Foshan Univ., Foshan, China
  • Volume
    01
  • fYear
    2013
  • fDate
    14-17 July 2013
  • Firstpage
    35
  • Lastpage
    39
  • Abstract
    In recent research, classification of imbalanced datasets has received considerable attention. Most classification algorithms tend to assign most incoming data to the majority class, resulting in poor classification performance on minority class instances, which are usually of much greater interest. In this paper we propose a clustering-based subset ensemble learning method for handling the class imbalance problem. In the proposed approach, new balanced training datasets are first produced using clustering-based under-sampling; the new training sets are then classified by applying four algorithms (Decision Tree, Naïve Bayes, KNN and SVM) as the base algorithms in combined-bagging. An experimental analysis is carried out over a wide range of highly imbalanced data sets. The results show that our method can improve classification performance on both rare and normal classes stably and effectively.
  • Keywords
    Bayes methods; decision trees; learning (artificial intelligence); pattern classification; pattern clustering; support vector machines; KNN; Naïve Bayes; SVM; balanced training datasets; class imbalanced problem; clustering-based subset ensemble learning method; clustering-based under-sampling; combined-bagging; decision tree; imbalanced dataset classification algorithm; minority class instances; Abstracts; Classification algorithms; Data mining; Learning systems; Niobium; Support vector machines; Vehicles; Classification; Clustering; Ensemble learning; Imbalanced data;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Machine Learning and Cybernetics (ICMLC), 2013 International Conference on
  • Conference_Location
    Tianjin
  • Type
    conf
  • DOI
    10.1109/ICMLC.2013.6890440
  • Filename
    6890440
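
The pipeline outlined in the abstract (clustering-based under-sampling to build balanced subsets, then an ensemble of base classifiers trained on those subsets) might be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the abstract does not name the clustering algorithm, so plain k-means is assumed; only two of the four base learners (KNN and a Gaussian Naïve Bayes) are included to keep it short; ties in the vote go to the minority class by assumption; and all function names, parameters, and the synthetic data are hypothetical.

```python
import math
import random
from collections import Counter

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means (assumed clustering step); returns non-empty clusters."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    clusters = [points]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centers[c]))
            clusters[nearest].append(p)
        # Keep the old center if a cluster ends up empty.
        centers = [tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else centers[i]
                   for i, cl in enumerate(clusters)]
    return [cl for cl in clusters if cl]

def balanced_subsets(majority, minority, n_subsets, k, seed=0):
    """Under-sample the majority class cluster by cluster; label 0 = majority, 1 = minority."""
    rng = random.Random(seed)
    clusters = kmeans(majority, k, seed=seed)
    subsets = []
    for _ in range(n_subsets):
        picked = []
        for cl in clusters:
            # Each cluster contributes a share proportional to its size,
            # so the majority sample is roughly as large as the minority class.
            quota = max(1, round(len(minority) * len(cl) / len(majority)))
            picked += rng.sample(cl, min(quota, len(cl)))
        subsets.append([(x, 0) for x in picked] + [(x, 1) for x in minority])
    return subsets

def knn_fit(train, k=3):
    """k-nearest-neighbour base learner; returns a predict function."""
    def predict(x):
        nearest = sorted(train, key=lambda s: math.dist(x, s[0]))[:k]
        return Counter(label for _, label in nearest).most_common(1)[0][0]
    return predict

def gaussian_nb_fit(train):
    """Gaussian Naive Bayes base learner; returns a predict function."""
    stats = {}
    for label in {y for _, y in train}:
        rows = [x for x, y in train if y == label]
        means = [sum(c) / len(c) for c in zip(*rows)]
        # A small constant keeps variances strictly positive.
        variances = [sum((v - m) ** 2 for v in c) / len(c) + 1e-6
                     for c, m in zip(zip(*rows), means)]
        stats[label] = (math.log(len(rows) / len(train)), means, variances)
    def predict(x):
        def log_posterior(label):
            prior, means, variances = stats[label]
            return prior + sum(-0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
                               for xi, m, v in zip(x, means, variances))
        return max(stats, key=log_posterior)
    return predict

def fit_ensemble(majority, minority, n_subsets=5, k=3, seed=0):
    """Train every base learner on every balanced subset and combine by majority vote."""
    models = [fit(subset)
              for subset in balanced_subsets(majority, minority, n_subsets, k, seed)
              for fit in (knn_fit, gaussian_nb_fit)]
    def predict(x):
        minority_votes = sum(model(x) for model in models)
        # Assumption: ties are resolved in favour of the minority class.
        return 1 if 2 * minority_votes >= len(models) else 0
    return predict

# Synthetic 20:1 imbalanced data, for illustration only.
rng = random.Random(42)
majority = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
minority = [(rng.gauss(4, 0.5), rng.gauss(4, 0.5)) for _ in range(10)]
predict = fit_ensemble(majority, minority)
print(predict((4.0, 4.0)), predict((0.0, 0.0)))
```

Drawing the majority sample per cluster rather than uniformly is one way to keep each balanced subset representative of the majority class's structure; whether the paper allocates samples to clusters in this proportional way is not stated in the abstract.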