DocumentCode
1937489
Title
A Weighted Rough Set Method to Address the Class Imbalance Problem
Author
Liu, Jin-Fu ; Yu, Da-Ren
Author_Institution
Harbin Institute of Technology, Harbin
Volume
7
fYear
2007
fDate
19-22 Aug. 2007
Firstpage
3693
Lastpage
3698
Abstract
The class imbalance problem has recently been reported to hinder the performance of learning systems. Most traditional learning algorithms are designed on the assumption of well-balanced datasets; they are therefore biased towards the majority class and may predict minority class examples poorly. In this paper, we develop weighted rough sets (WRS) to deal with this problem. In weighted rough sets, weighted entropy is introduced and extended to compute the information content contributed by attributes. A forward greedy weighted attribute reduction algorithm based on the weighted entropy and a weighted rule extraction algorithm are provided. The factors of weighted strength, weighted certainty and weighted cover are employed to evaluate the extracted rules. Finally, a decision algorithm based on the weighted strength factor is constructed. Based on weighted rough sets, a series of experiments on class imbalance learning are conducted on 20 UCI data sets. In terms of AUC and minority class accuracy, WRS achieves better results than classical rough sets in class imbalance learning. Moreover, the evaluation of extracted rules has a greater influence than the selection of attributes on weighted rough set learning.
Keywords
decision theory; entropy; greedy algorithms; learning (artificial intelligence); rough set theory; class imbalance problem; decision algorithm; forward greedy weighted attribute reduction algorithm; learning algorithm; learning system; weighted certainty; weighted cover; weighted entropy; weighted rough set; weighted rule extraction algorithm; weighted strength; Algorithm design and analysis; Cybernetics; Data mining; Entropy; Information systems; Learning systems; Machine learning; Machine learning algorithms; Rough sets; Training data; Class imbalance learning; Instance weighting; Rough sets; Rule extraction; Weighted entropy
fLanguage
English
Publisher
IEEE
Conference_Titel
Machine Learning and Cybernetics, 2007 International Conference on
Conference_Location
Hong Kong
Print_ISBN
978-1-4244-0973-0
Electronic_ISBN
978-1-4244-0973-0
Type
conf
DOI
10.1109/ICMLC.2007.4370789
Filename
4370789