Title :
Mining Association Rules from Data with Missing Values by Database Partitioning and Merging
Author :
Shintani, Takahiko
Author_Institution :
Central Res. Lab., Hitachi Ltd., Tokyo
Abstract :
Real-world datasets often contain many missing values, so handling them is an important problem when mining association rules from real data. In this paper, we propose a pattern-growth based algorithm for mining association rules from data with missing values. No data imputation is performed. Each association rule is evaluated using all the data records in which none of the rule's attributes is missing. Our algorithm partitions the database so that records whose missing values fall on the same set of attributes are assigned to the same database partition, and it mines association rules by combining these database partitions. We propose three methods of reducing the processing workload: estimating the upper bound of global support from local supports, reusing part of the constructed tree structure, and merging redundant database partitions. Our performance study shows that our algorithm is efficient and can always find all association rules.
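The partitioning and evaluation rule described in the abstract can be illustrated with a minimal sketch. This is not the paper's pattern-growth algorithm: it is a brute-force illustration only, and the names `MISSING`, `partition_by_missing`, and `support`, as well as the dictionary record layout, are assumptions made for this example. Records are grouped by the set of attributes that are missing, and an itemset is then counted only over partitions in which none of its attributes is missing.

```python
from collections import defaultdict

MISSING = None  # assumed marker for a missing value (not from the paper)


def partition_by_missing(records):
    """Group records by the set of attributes whose value is missing.

    Every record is assumed to be a dict with the same attribute keys.
    """
    partitions = defaultdict(list)
    for rec in records:
        mask = frozenset(a for a, v in rec.items() if v is MISSING)
        partitions[mask].append(rec)
    return dict(partitions)


def support(partitions, itemset):
    """Count an itemset over the records where all of its attributes are present.

    `itemset` maps attribute -> required value.
    Returns (matching_records, usable_records).
    """
    attrs = set(itemset)
    matching = usable = 0
    for mask, recs in partitions.items():
        if mask & attrs:
            # Some attribute of the itemset is missing in this whole partition.
            continue
        usable += len(recs)
        matching += sum(
            all(r[a] == v for a, v in itemset.items()) for r in recs
        )
    return matching, usable


# Hypothetical toy data, invented for illustration.
records = [
    {"fever": "yes", "cough": "yes", "test": "pos"},
    {"fever": "yes", "cough": None, "test": "pos"},
    {"fever": "no", "cough": "yes", "test": None},
    {"fever": "yes", "cough": "yes", "test": None},
]
parts = partition_by_missing(records)
print(support(parts, {"fever": "yes", "cough": "yes"}))
```

In this toy data, the record with a missing `cough` value is skipped when counting the itemset over `fever` and `cough`, while the two records with a missing `test` value still contribute. Because a whole partition is either usable or not for a given itemset, the decision is made once per partition rather than once per record.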
Keywords :
data mining; merging; tree data structures; very large databases; association rule mining; database merging; database partitioning; pattern-growth based algorithm; tree structure; very large database; Association rules; Data mining; Databases; Diseases; Laboratories; Medical treatment; Merging; Partitioning algorithms; Tree data structures; Upper bound
Conference_Title :
Computer and Information Science, 2006 and 2006 1st IEEE/ACIS International Workshop on Component-Based Software Engineering, Software Architecture and Reuse. ICIS-COMSAR 2006. 5th IEEE/ACIS International Conference on
Conference_Location :
Honolulu, HI
Print_ISBN :
0-7695-2613-6
DOI :
10.1109/ICIS-COMSAR.2006.60