DocumentCode
3310197
Title
Minority split and gain ratio for a class imbalance
Author
Boonchuay, K.; Sinapiromsaran, Krung; Lursinsap, C.
Author_Institution
Dept. of Math., Chulalongkorn Univ., Bangkok, Thailand
Volume
3
fYear
2011
fDate
26-28 July 2011
Firstpage
2060
Lastpage
2064
Abstract
A decision tree is one of the most popular classifiers and classifies a balanced data set effectively. For an imbalanced data set, however, a standard decision tree tends to misclassify instances of a class that has very few samples. In this paper, we modify the decision tree induction algorithm by performing a ternary split on continuous-valued attributes, focusing on the distribution of minority class instances. The algorithm uses the minority variance to rank the candidates with high gain ratio, then chooses the candidate with the minimum minority entropy. In our experiments with data sets from the UCI and Statlog repositories, this method achieves better performance on imbalanced data sets than C4.5, which uses only the gain ratio.
Keywords
decision trees; entropy; pattern classification; Statlog repository; UCI repository; balanced data set; class imbalance; decision tree induction algorithm; imbalanced data set; minimum minority entropy; minority class instance distribution; minority gain ratio; minority split ratio; Accuracy; Classification algorithms; Decision trees; Entropy; Impurities; Machine learning; Training; Class imbalance; Classification; Decision tree; Gain Ratio; Minority split;
fLanguage
English
Publisher
IEEE
Conference_Titel
Fuzzy Systems and Knowledge Discovery (FSKD), 2011 Eighth International Conference on
Conference_Location
Shanghai
Print_ISBN
978-1-61284-180-9
Type
conf
DOI
10.1109/FSKD.2011.6019836
Filename
6019836