• DocumentCode
    2386709
  • Title
    Addressing Missing Attributes during Data Mining Using Frequent Itemsets and Rough Set Based Predictions
  • Author
    Li, Jiye ; Cercone, Nick ; Cohen, Robin
  • Author_Institution
    York Univ., Toronto
  • fYear
    2007
  • fDate
    2-4 Nov. 2007
  • Firstpage
    294
  • Lastpage
    294
  • Abstract
    In this paper, we present an improved method for predicting missing attribute values in data sets. We use frequent itemsets, generated by the association rules algorithm, which capture the correlations between items in a set of transactions. In particular, we treat a database as a set of transactions and each data instance as an itemset, so that frequent itemsets can serve as a knowledge base for predicting missing attribute values. Our approach integrates the RSFit method, based on rough set theory, which produces faster predictions by considering similarities of attribute-value pairs, but only for those attributes contained in the core or reduct of the data set. Using empirical studies on UCI and other real-world data sets, we demonstrate a significant increase in prediction accuracy with the new integrated approach, referred to as ItemRSFit.
  • Keywords
    data mining; rough set theory; ItemRSFit; RSFit method; association rules algorithm; frequent itemsets; knowledge base; missing attribute value prediction; missing attributes; rough set based predictions; Accuracy; Association rules; Data preprocessing; Data privacy; Design for experiments; Itemsets; Rough sets; Testing; Transaction databases
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Granular Computing, 2007. GRC 2007. IEEE International Conference on
  • Conference_Location
    Fremont, CA
  • Print_ISBN
    978-0-7695-3032-1
  • Type
    conf
  • DOI
    10.1109/GrC.2007.144
  • Filename
    4403113