DocumentCode :
1574045
Title :
A Cluster-based Regrouping approach for Imbalanced data distributions
Author :
Yu, Wen ; Jiang, ShengYi
Author_Institution :
School of Management, Guangdong University of Foreign Studies, Guangzhou 510006, China
fYear :
2012
Firstpage :
121
Lastpage :
124
Abstract :
In real-world applications, class imbalance (significant differences in class prior probabilities) has been observed to cause a substantial deterioration in classifier performance, particularly for patterns belonging to the less represented classes. In this paper, we propose a Cluster-based Regrouping approach (CR), which divides the whole training data into a positive group and a negative group by clustering using the outlier factor. As a result, similar samples fall in the same group while dissimilar samples fall in different groups. A base classifier is then used to build a model on each of the positive and negative groups. When classifying a new object, the model used for evaluation is chosen according to the type of the group to which the new object is nearest. The experimental results demonstrate that our approach achieves promising performance in some cases by directly or indirectly reducing the skewness of the class distribution.
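The regrouping pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not define the outlier factor, so a simple share-of-positive-samples rule (the hypothetical `pos_share` parameter) stands in for it, and a toy nearest-centroid classifier stands in for the C4.5 or naive Bayes base classifier. All function and parameter names are hypothetical.

```python
import math


def dist(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def one_pass_cluster(X, radius):
    """Single scan: join the nearest cluster if within radius, else open a new one."""
    clusters = []
    for i, x in enumerate(X):
        best = min(clusters, key=lambda c: dist(x, c["centroid"]), default=None)
        if best is not None and dist(x, best["centroid"]) <= radius:
            best["members"].append(i)
            n = len(best["members"])
            # Incremental update of the running mean.
            best["centroid"] = [(c * (n - 1) + v) / n
                                for c, v in zip(best["centroid"], x)]
        else:
            clusters.append({"members": [i], "centroid": list(x)})
    return clusters


class NearestCentroid:
    """Toy base classifier (assumed stand-in for C4.5 or naive Bayes)."""

    def fit(self, X, y):
        self.centroids = {}
        for label in set(y):
            pts = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [sum(col) / len(pts) for col in zip(*pts)]
        return self

    def predict(self, x):
        return min(self.centroids, key=lambda lab: dist(x, self.centroids[lab]))


def fit_cr(X, y, radius, pos_label=1, pos_share=0.5):
    """Cluster, assign each cluster to a group, train one model per group."""
    clusters = one_pass_cluster(X, radius)
    groups = {"pos": [], "neg": []}
    for c in clusters:
        share = sum(y[i] == pos_label for i in c["members"]) / len(c["members"])
        # Assumption: this share rule replaces the paper's outlier-factor criterion.
        c["group"] = "pos" if share >= pos_share else "neg"
        groups[c["group"]].extend(c["members"])
    models = {g: NearestCentroid().fit([X[i] for i in idx], [y[i] for i in idx])
              for g, idx in groups.items() if idx}
    return clusters, models


def predict_cr(x, clusters, models):
    """Route x to the model of the group that owns its nearest cluster."""
    nearest = min(clusters, key=lambda c: dist(x, c["centroid"]))
    return models[nearest["group"]].predict(x)


# Tiny imbalanced toy set: six negatives near the origin, two positives near (5, 5).
X = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (0.3, 0.2),
     (0.2, 0.4), (0.4, 0.1), (5.0, 5.0), (5.2, 5.1)]
y = [0, 0, 0, 0, 0, 0, 1, 1]
clusters, models = fit_cr(X, y, radius=1.0)
```

With this toy data, `predict_cr((4.8, 5.1), clusters, models)` is routed to the positive group's model, while a point near the origin is routed to the negative group's model. Each group sees a less skewed class distribution than the full training set, which is the effect the abstract attributes to regrouping.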
Keywords :
C4.5; Imbalanced data classification; Naïve Bayes; One-pass clustering
fLanguage :
English
Publisher :
IEEE
Conference_Title :
World Automation Congress (WAC), 2012
Conference_Location :
Puerto Vallarta, Mexico
ISSN :
2154-4824
Print_ISBN :
978-1-4673-4497-5
Type :
conf
Filename :
6321051