DocumentCode :
1471607
Title :
Multiclass Imbalance Problems: Analysis and Potential Solutions
Author :
Shuo Wang ; Xin Yao
Author_Institution :
Centre of Excellence for Research in Computational Intelligence and Applications, University of Birmingham, Birmingham, UK
Volume :
42
Issue :
4
fYear :
2012
Firstpage :
1119
Lastpage :
1130
Abstract :
Class imbalance problems have drawn growing interest recently because of their classification difficulty caused by the imbalanced class distributions. In particular, many ensemble methods have been proposed to deal with such imbalance. However, most efforts so far are only focused on two-class imbalance problems. There are unsolved issues in multiclass imbalance problems, which exist in real-world applications. This paper studies the challenges posed by the multiclass imbalance problems and investigates the generalization ability of some ensemble solutions, including our recently proposed algorithm AdaBoost.NC, with the aim of handling multiclass and imbalance effectively and directly. We first study the impact of multiminority and multimajority on the performance of two basic resampling techniques. They both present strong negative effects. “Multimajority” tends to be more harmful to the generalization performance. Motivated by the results, we then apply AdaBoost.NC to several real-world multiclass imbalance tasks and compare it to other popular ensemble methods. AdaBoost.NC is shown to be better at recognizing minority class examples and balancing the performance among classes in terms of G-mean without using any class decomposition.
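The abstract reports results in terms of G-mean. As a minimal sketch, assuming G-mean is taken here as the geometric mean of the per-class recall values over all classes (the usual multiclass reading), it could be computed as follows. The function name `g_mean` and the toy labels are illustrative assumptions, not taken from the paper.

```python
import math
from collections import Counter


def g_mean(y_true, y_pred):
    """Geometric mean of per-class recall (hypothetical helper).

    Assumes every class of interest appears at least once in y_true.
    A class that is never predicted correctly drives the score to 0.
    """
    totals = Counter(y_true)  # number of examples per true class
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)  # correct per class
    recalls = [hits[c] / totals[c] for c in totals]
    return math.prod(recalls) ** (1.0 / len(recalls))


# Toy example: one majority class "a" and two minority classes "b" and "c".
y_true = ["a"] * 8 + ["b"] * 2 + ["c"] * 2
y_pred = ["a"] * 8 + ["b", "a"] + ["c", "a"]
score = g_mean(y_true, y_pred)  # recalls are 1.0, 0.5, 0.5
```

Because the per-class recalls are multiplied, a classifier that ignores one minority class scores 0 regardless of its accuracy on the majority class, which is why the measure suits the balance-among-classes comparison described in the abstract.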
Keywords :
learning (artificial intelligence); pattern classification; sampling methods; AdaBoost.NC algorithm; G-mean; classification problem; ensemble learning; ensemble method; generalization ability; generalization performance; imbalanced class distribution; minority class recognition; multiclass imbalance problem; multimajority impact; multiminority impact; resampling techniques; two-class imbalance problems; Correlation; Cybernetics; Genetic algorithms; IEEE Potentials; Pattern analysis; Training; Training data; Boosting; diversity; ensemble learning; multiclass imbalance problems; negative correlation learning;
fLanguage :
English
Journal_Title :
Systems, Man, and Cybernetics, Part B: Cybernetics, IEEE Transactions on
Publisher :
IEEE
ISSN :
1083-4419
Type :
jour
DOI :
10.1109/TSMCB.2012.2187280
Filename :
6170916