• DocumentCode
    1937489
  • Title
    A Weighted Rough Set Method to Address the Class Imbalance Problem
  • Author
    Liu, Jin-Fu; Yu, Da-Ren
  • Author_Institution
    Harbin Inst. of Technol., Harbin
  • Volume
    7
  • fYear
    2007
  • fDate
    19-22 Aug. 2007
  • Firstpage
    3693
  • Lastpage
    3698
  • Abstract
    The class imbalance problem has recently been reported to hinder the performance of learning systems. Most traditional learning algorithms are designed under the assumption of well-balanced data sets; they are therefore biased towards the majority class and may predict minority class examples poorly. In this paper, we develop weighted rough sets (WRS) to deal with this problem. In weighted rough sets, weighted entropy is introduced and extended to compute the information content contributed by attributes. A forward greedy weighted attribute reduction algorithm based on the weighted entropy and a weighted rule extraction algorithm are provided. The factors of weighted strength, weighted certainty and weighted cover are employed to evaluate the extracted rules. Finally, a decision algorithm based on the weighted strength factor is constructed. A series of class imbalance learning experiments with weighted rough sets is conducted on 20 UCI data sets. In terms of AUC and minority class accuracy, WRS achieves better results than classical rough sets in class imbalance learning. Moreover, the evaluation of extracted rules has a greater influence than the selection of attributes on weighted rough set learning.
  • Keywords
    decision theory; entropy; greedy algorithms; learning (artificial intelligence); rough set theory; class imbalance problem; decision algorithm; forward greedy weighted attribute reduction algorithm; learning algorithm; learning system; weighted certainty; weighted cover; weighted entropy; weighted rough set; weighted rule extraction algorithm; weighted strength; Algorithm design and analysis; Cybernetics; Data mining; Entropy; Information systems; Learning systems; Machine learning; Machine learning algorithms; Rough sets; Training data; Class imbalance learning; Instance weighting; Rough sets; Rule extraction; Weighted entropy;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Machine Learning and Cybernetics, 2007 International Conference on
  • Conference_Location
    Hong Kong
  • Print_ISBN
    978-1-4244-0973-0
  • Electronic_ISBN
    978-1-4244-0973-0
  • Type
    conf
  • DOI
    10.1109/ICMLC.2007.4370789
  • Filename
    4370789