DocumentCode :
402869
Title :
Scalable algorithm for mining maximal frequent itemsets
Author :
Li, Qing-hua ; Wang, Hui ; He, Ye ; Jiang, Sheng-yi
Author_Institution :
Sch. of Comput., Huazhong Univ. of Sci. & Technol., Wuhan, China
Volume :
1
fYear :
2003
fDate :
2-5 Nov. 2003
Firstpage :
143
Abstract :
The discovery of frequent itemsets is a very computational and I/O intensive task, and beyond a certain database size, it is crucial to leverage and the combined computational power of multiple processors for fast response and scalability. In this paper we present new scalable algorithm for maximal frequent itemset mining. It decomposes the search space by prefix-based equivalence classes, distributes work among the processors and selectively duplicates databases in such a way that each processor can compute the maximal frequent itemsets independently. It utilizes multiple level backtrack pruning strategy, along with vertical database format, counting frequency by simple tid-list intersection operation. These techniques eliminate the need for synchronization, drastically cutting down the communication cost. The analysis and experimental results demonstrate that our approach is well scalable in speedup and sizeup.
Keywords :
data mining; database management systems; equivalence classes; database duplication; decomposition technique; independent classes; maximal frequent itemsets mining; multiple level backtrack pruning strategy; multiple processors; parallel mining; prefix-based equivalence classes; scalable algorithm; search space; tid-list intersection operation; vertical database format; Algorithm design and analysis; Costs; Data mining; Databases; Distributed computing; Educational institutions; Frequency synchronization; Itemsets; Partitioning algorithms; Scalability;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Machine Learning and Cybernetics, 2003 International Conference on
Print_ISBN :
0-7803-8131-9
Type :
conf
DOI :
10.1109/ICMLC.2003.1264459
Filename :
1264459
Link To Document :
بازگشت