A comparative study among three algorithms for frequent pattern generation

Author

Islam, Rashed ; Khan, S.M. ; Azud us zaman, M. ; Kabir Robin, S.S.

fYear

2004

fDate

16-18 Dec. 2004

Firstpage

336

Lastpage

343

Abstract

Efficient algorithms to mine frequent patterns are crucial to many tasks in data mining. Since the Apriori algorithm was proposed in 1994, several methods have been proposed to improve its performance. However, most still adopt its candidate set generation-and-test approach. In addition, many methods do not generate all frequent patterns, making them inadequate for deriving association rules. The Pattern Decomposition (PD) algorithm significantly reduces the size of the dataset on each pass, making it more efficient to mine all frequent patterns in a large dataset. This algorithm avoids the costly process of candidate set generation and, by evaluating support on reduced datasets, saves a great amount of counting time. In this paper, some existing frequent pattern generation algorithms are explored and compared. The comparison shows that the PD algorithm outperforms an improved version of Apriori, named Direct Count of candidates & Prune transactions (DCP), by one order of magnitude and is faster than an improved FP-tree (Frequent Pattern tree) method named Predictive Item Pruning (PIP). Further, PD is also more scalable than both DCP and PIP.

Keywords

Association rules; Computer science; Data engineering; Data mining; Data visualization; Database systems; Explosions; Itemsets; Knowledge representation; Visual databases;

fLanguage

English

Publisher

IEEE

Conference_Titel

Machine Learning and Applications, 2004. Proceedings. 2004 International Conference on

Conference_Location

Louisville, Kentucky, USA

Print_ISBN

0-7803-8823-2

Type

conf

DOI

10.1109/ICMLA.2004.1383532

Filename

1383532