مرکز منطقه ای اطلاع رساني علوم و فناوري - Scalable algorithm for mining maximal frequent itemsets

DocumentCode :

402869

Title :

Scalable algorithm for mining maximal frequent itemsets

Author :

Li, Qing-hua ; Wang, Hui ; He, Ye ; Jiang, Sheng-yi

Author_Institution :

Sch. of Comput., Huazhong Univ. of Sci. & Technol., Wuhan, China

Volume :

fYear :

2003

fDate :

2-5 Nov. 2003

Firstpage :

143

Abstract :

The discovery of frequent itemsets is a very computational and I/O intensive task, and beyond a certain database size, it is crucial to leverage and the combined computational power of multiple processors for fast response and scalability. In this paper we present new scalable algorithm for maximal frequent itemset mining. It decomposes the search space by prefix-based equivalence classes, distributes work among the processors and selectively duplicates databases in such a way that each processor can compute the maximal frequent itemsets independently. It utilizes multiple level backtrack pruning strategy, along with vertical database format, counting frequency by simple tid-list intersection operation. These techniques eliminate the need for synchronization, drastically cutting down the communication cost. The analysis and experimental results demonstrate that our approach is well scalable in speedup and sizeup.

Keywords :

data mining; database management systems; equivalence classes; database duplication; decomposition technique; independent classes; maximal frequent itemsets mining; multiple level backtrack pruning strategy; multiple processors; parallel mining; prefix-based equivalence classes; scalable algorithm; search space; tid-list intersection operation; vertical database format; Algorithm design and analysis; Costs; Data mining; Databases; Distributed computing; Educational institutions; Frequency synchronization; Itemsets; Partitioning algorithms; Scalability;

fLanguage :

English

Publisher :

ieee

Conference_Titel :

Machine Learning and Cybernetics, 2003 International Conference on

Print_ISBN :

0-7803-8131-9

Type :

conf

DOI :

10.1109/ICMLC.2003.1264459

Filename :

1264459

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=402869