Title :
A pattern decomposition (PD) algorithm for finding all frequent patterns in large datasets
Author :
Zou, Qinghua ; Chu, Wesley ; Johnson, David ; Chiu, Henry
Author_Institution :
Dept. of Computer Science, University of California, Los Angeles, CA, USA
Abstract :
Efficient algorithms to mine frequent patterns are crucial to many tasks in data mining. Since the Apriori algorithm was proposed (R. Agrawal and R. Srikant, 1994), several methods have been proposed to improve its performance. However, most still adopt its candidate set generation-and-test approach. We propose a pattern decomposition (PD) algorithm that can significantly reduce the size of the dataset on each pass, making it more efficient to mine frequent patterns in a large dataset. The proposed algorithm avoids the costly process of candidate set generation and saves time by reducing the dataset. Our empirical evaluation shows that the algorithm outperforms Apriori by an order of magnitude and is faster than FP-tree. Further, PD is more scalable than both Apriori and FP-tree.
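For readers unfamiliar with the baseline the abstract compares against, the following is a minimal sketch of the candidate set generation-and-test approach used by Apriori. It is not the PD algorithm and is not taken from the paper; the function name `apriori`, the parameter `min_support` (an absolute count), and the toy transactions are assumptions made for illustration only.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support count} for every frequent itemset.

    Illustrative Apriori-style baseline (hypothetical helper, not the
    paper's PD algorithm). min_support is an absolute count.
    """
    transactions = [frozenset(t) for t in transactions]

    # Pass 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        # Generate: join frequent (k-1)-itemsets into k-item candidates and
        # keep only those whose every (k-1)-subset is itself frequent.
        prev = set(frequent)
        candidates = set()
        for a in prev:
            for b in prev:
                union = a | b
                if len(union) == k and all(
                    frozenset(sub) in prev
                    for sub in combinations(union, k - 1)
                ):
                    candidates.add(union)

        # Test: one scan over the full, unreduced dataset per level.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1

        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1

    return result


if __name__ == "__main__":
    toy = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
    for itemset, support in sorted(
        apriori(toy, 3).items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))
    ):
        print(sorted(itemset), support)
```

Each level in this sketch builds a fresh candidate set and rescans every transaction, which is the cost the abstract says PD avoids by shrinking the dataset on each pass.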
Keywords :
data mining; pattern recognition; set theory; very large databases; Apriori algorithm; FP-tree; candidate set generation; candidate set generation-and-test approach; frequent pattern mining; large datasets; pattern decomposition algorithm; Association rules; Computer science; Itemsets
Conference_Title :
Data Mining, 2001. ICDM 2001, Proceedings IEEE International Conference on
Conference_Location :
San Jose, CA
Print_ISBN :
0-7695-1119-8
DOI :
10.1109/ICDM.2001.989603