DocumentCode :
2373424
Title :
Mining frequent closed itemsets for large data
Author :
Huaiguo Fu ; Mephu Nguifo, E.
fYear :
2004
fDate :
16-18 Dec. 2004
Firstpage :
328
Lastpage :
335
Abstract :
Mining frequent closed itemsets is an effective way to analyse frequent patterns and, further, to generate association rules. Several algorithms have been proposed to generate frequent closed itemsets, including CLOSE, A-CLOSE, CLOSET, CHARM and CLOSET+. However, it is still hard for these algorithms to deal with dense and very large data. In this paper, we analyze the search space of frequent closed itemsets and propose a new decomposition algorithm for mining frequent closed itemsets, called PFC. PFC can dynamically generate non-overlapping partitions of the search space and mine frequent closed itemsets in each partition. Furthermore, each partition is independent and shares only the same source data with other partitions. It is therefore possible to implement PFC with multiple threads or parallel methods, and to prune the search space of frequent closed itemsets efficiently. In this study, PFC is implemented in Java. We compare PFC with an author's C++ version of CLOSET+ on some large UCI repository datasets and on the worst case. The preliminary experimental results demonstrate good performance of PFC in dealing with dense and very large data.
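The idea of mining closed itemsets in independent, non-overlapping partitions can be illustrated with a small brute-force sketch. This is not the PFC algorithm, whose details are not given in this record. It assumes one simple partitioning scheme (each partition holds the closed itemsets whose smallest item is a given item), and the names support, closure, mine_partition and mine are hypothetical helpers introduced only for illustration.

```python
from itertools import combinations


def support(itemset, transactions):
    """Count the transactions that contain every item of itemset."""
    return sum(1 for t in transactions if itemset <= t)


def closure(itemset, transactions):
    """Intersect all transactions containing itemset (None if there are none)."""
    covering = [t for t in transactions if itemset <= t]
    if not covering:
        return None
    return frozenset.intersection(*covering)


def mine_partition(first_item, items, transactions, min_support):
    """Return the frequent closed itemsets whose smallest item is first_item.

    Assumed partitioning scheme for illustration only: a partition reads the
    shared transactions but never depends on the results of another partition.
    """
    larger = [i for i in items if i > first_item]
    found = set()
    for size in range(len(larger) + 1):
        for rest in combinations(larger, size):
            candidate = frozenset((first_item,) + rest)
            if support(candidate, transactions) < min_support:
                continue
            closed = closure(candidate, transactions)
            # A closure containing a smaller item belongs to another partition.
            if min(closed) == first_item:
                found.add(closed)
    return found


def mine(transactions, min_support):
    """Union of all partitions: every non-empty frequent closed itemset."""
    items = sorted(set().union(*transactions))
    result = set()
    for item in items:  # each iteration could run in its own thread or process
        result |= mine_partition(item, items, transactions, min_support)
    return result
```

Because each closed itemset has exactly one smallest item, the partitions cannot overlap, and the loop in mine could be distributed across workers without coordination. The exhaustive enumeration is exponential in the number of items, so the sketch is only suitable for toy data.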
Keywords :
Association rules; Data analysis; Data mining; Databases; Itemsets; Java; Lattices; Lenses; Partitioning algorithms; Pattern analysis;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Machine Learning and Applications, 2004. Proceedings. 2004 International Conference on
Conference_Location :
Louisville, Kentucky, USA
Print_ISBN :
0-7803-8823-2
Type :
conf
DOI :
10.1109/ICMLA.2004.1383531
Filename :
1383531