Title :
A comparative study among three algorithms for frequent pattern generation
Author :
Islam, Rashed ; Khan, S.M. ; Azud us zaman, M. ; Kabir Robin, S.S.
Abstract :
Efficient algorithms for mining frequent patterns are crucial to many data mining tasks. Since the Apriori algorithm was proposed in 1994, several methods have been proposed to improve its performance. However, most still adopt its candidate set generation-and-test approach. In addition, many methods do not generate all frequent patterns, which makes them inadequate for deriving association rules. The Pattern Decomposition (PD) algorithm significantly reduces the size of the dataset on each pass, which makes it more efficient at mining all frequent patterns in a large dataset. It avoids the costly process of candidate set generation and, because support is evaluated on reduced datasets, saves a great deal of counting time. In this paper, some existing frequent pattern generation algorithms are explored and compared. The comparison shows that PD outperforms an improved version of Apriori, named Direct Count of candidates & Prune transactions (DCP), by one order of magnitude and is faster than an improved FP-tree (Frequent Pattern tree) method named Predictive Item Pruning (PIP). PD is also more scalable than both DCP and PIP.
Keywords :
Association rules; Computer science; Data engineering; Data mining; Data visualization; Database systems; Explosions; Itemsets; Knowledge representation; Visual databases;
Conference_Title :
Machine Learning and Applications, 2004. Proceedings. 2004 International Conference on
Conference_Location :
Louisville, Kentucky, USA
Print_ISBN :
0-7803-8823-2
DOI :
10.1109/ICMLA.2004.1383532