• DocumentCode
    2334595
  • Title
    Using rough sets theory and database operations to construct a good ensemble of classifiers for data mining applications
  • Author
    Hu, Xiaohua
  • Author_Institution
    Vigilance Inc., Sunnyvale, CA, USA
  • fYear
    2001
  • fDate
    2001
  • Firstpage
    233
  • Lastpage
    240
  • Abstract
    The article presents a novel approach to constructing a good ensemble of classifiers using rough set theory and database operations. Ensembles of classifiers are formulated precisely within the framework of rough set theory and constructed very efficiently using set-oriented database operations. Our method first computes a set of reducts, each of which includes all the indispensable attributes required for the decision categories. For each reduct, a reduct table is generated by removing the attributes that are not in the reduct. Next, a novel rule induction algorithm computes the maximal generalized rules for each reduct table, and a set of reduct classifiers is formed from the corresponding reducts. Compared with other methods of constructing ensembles of classifiers, our method has the following distinctive features: (1) it presents a theoretical model that explains the mechanism of constructing an ensemble of classifiers; (2) each reduct is a minimum subset of attributes with the same classification ability as the full attribute set; (3) each reduct classifier constructed from the corresponding reduct has a minimal set of classification rules and is as accurate and complete as possible while being as diverse as possible from the other classifiers; (4) tests indicate that the number of classifiers needed to improve accuracy is much smaller than in other methods.
  • Keywords
    data mining; decision tables; inference mechanisms; pattern classification; rough set theory; very large databases; classification ability; classification rules; data mining applications; database operations; decision categories; indispensable attributes; maximal generalized rules; minimum subset; reduct classifier; reduct classifiers; reduct table; reducts; rule induction algorithm; set-oriented database operations; Bagging; Boosting; Data mining; Databases; Decision trees; Induction generators; Rough sets; Testing; Training data; Voting
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining, 2001. ICDM 2001, Proceedings IEEE International Conference on
  • Conference_Location
    San Jose, CA
  • Print_ISBN
    0-7695-1119-8
  • Type
    conf
  • DOI
    10.1109/ICDM.2001.989524
  • Filename
    989524
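
The pipeline summarized in the abstract (compute reducts, project a reduct table per reduct, build one classifier per reduct, combine them) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy decision table, the brute-force reduct search, the function names, and the majority vote. In particular, the paper's rule induction algorithm and its set-oriented database operations are not reproduced here; each reduct classifier is simply a lookup over the projected table.

```python
from itertools import combinations
from collections import Counter

# Hypothetical toy decision table: condition attributes a, b, c; decision d.
ROWS = [
    {"a": 0, "b": 0, "c": 0, "d": "no"},
    {"a": 0, "b": 1, "c": 1, "d": "yes"},
    {"a": 1, "b": 0, "c": 1, "d": "yes"},
    {"a": 1, "b": 1, "c": 0, "d": "no"},
]


def is_consistent(rows, attrs, decision):
    """True if rows that agree on `attrs` always agree on the decision."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if seen.setdefault(key, row[decision]) != row[decision]:
            return False
    return True


def find_reducts(rows, conditions, decision):
    """Brute-force search for minimal consistent attribute subsets.

    Exponential in the number of attributes; suitable for toy data only.
    """
    reducts = []
    for size in range(1, len(conditions) + 1):
        for subset in combinations(conditions, size):
            # Skip supersets of a reduct already found: they are not minimal.
            if any(set(r) <= set(subset) for r in reducts):
                continue
            if is_consistent(rows, subset, decision):
                reducts.append(subset)
    return reducts


def reduct_table(rows, reduct, decision):
    """Project rows onto the reduct, similar to a SELECT DISTINCT projection."""
    return {tuple(row[a] for a in reduct): row[decision] for row in rows}


def classify(tables, example):
    """Majority vote over the per-reduct lookup classifiers."""
    votes = Counter()
    for reduct, table in tables.items():
        key = tuple(example[a] for a in reduct)
        if key in table:
            votes[table[key]] += 1
    return votes.most_common(1)[0][0] if votes else None


reducts = find_reducts(ROWS, ("a", "b", "c"), "d")
tables = {r: reduct_table(ROWS, r, "d") for r in reducts}
prediction = classify(tables, {"a": 0, "b": 1, "c": 1})
```

On this toy table the search finds two reducts, `("c",)` and `("a", "b")`, so the ensemble has two members that rely on disjoint attributes, which loosely mirrors the diversity argument in the abstract.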