DocumentCode
2386709
Title
Addressing Missing Attributes during Data Mining Using Frequent Itemsets and Rough Set Based Predictions
Author
Li, Jiye ; Cercone, Nick ; Cohen, Robin
Author_Institution
York Univ., Toronto
fYear
2007
fDate
2-4 Nov. 2007
Firstpage
294
Lastpage
294
Abstract
In this paper, we present an improved method for predicting missing attribute values in data sets. We use frequent itemsets, generated by the association rules algorithm, which capture the correlations between different items in a set of transactions. In particular, we treat a database as a set of transactions and each data instance as an itemset, so that frequent itemsets can serve as a knowledge base for predicting missing attribute values. Our approach integrates the RSFit method, which is based on rough sets theory and produces faster predictions by considering similarities of attribute-value pairs, but only for those attributes contained in the core or a reduct of the data set. Using empirical studies on UCI and other real-world data sets, we demonstrate a significant increase in prediction accuracy obtained from our new integrated approach, referred to as ItemRSFit.
Keywords
data mining; rough set theory; ItemRSFit; RSFit method; association rules algorithm; frequent itemsets; knowledge base; missing attribute value prediction; missing attributes; rough set based predictions; rough sets theory; Accuracy; Association rules; Data preprocessing; Data privacy; Design for experiments; Itemsets; Rough sets; Testing; Transaction databases;
fLanguage
English
Publisher
IEEE
Conference_Titel
Granular Computing, 2007. GRC 2007. IEEE International Conference on
Conference_Location
Fremont, CA
Print_ISBN
978-0-7695-3032-1
Type
conf
DOI
10.1109/GrC.2007.144
Filename
4403113