Title :
Distributed mining of maximal frequent itemsets from databases on a cluster of workstations
Author :
Chung, Soon M. ; Luo, Congnan
Author_Institution :
Dept. of Comput. Sci. & Eng., Wright State Univ., Dayton, OH, USA
Abstract :
In this paper, we propose a new algorithm, named Distributed Max-Miner (DMM), for mining maximal frequent itemsets from databases. A frequent itemset is maximal if none of its supersets is frequent. DMM requires very low communication and synchronization overhead in distributed computing systems. DMM consists of two phases: a local mining phase and a global mining phase. During the local mining phase, each node mines its local database to discover the local maximal frequent itemsets; together, these form the set of maximal candidate itemsets for the top-down search in the subsequent global mining phase. A new prefix-tree data structure is developed to facilitate the storage and counting of global candidate itemsets of different sizes. The global mining phase, which uses the prefix-tree, can work with any local mining algorithm. We implemented DMM on a cluster of workstations and evaluated its performance for various cases. DMM demonstrates better performance than other sequential and parallel algorithms, and it scales well even when the databases contain large maximal frequent itemsets (i.e., long patterns).
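The two-phase idea described in the abstract can be illustrated with a small single-process sketch. This is not the authors' implementation: it uses brute-force level-wise enumeration for the local phase, plain Python sets in place of the prefix-tree, and an in-memory list of partitions in place of cluster nodes. The function names (`support`, `maximal`, `local_maximal_frequent`, `global_maximal_frequent`) and the relative threshold parameter `min_frac` are hypothetical, chosen only for this illustration. It relies on the standard partitioning property that an itemset frequent in the whole database must be frequent in at least one partition, so every global maximal frequent itemset is a subset of some local maximal one.

```python
from itertools import combinations


def support(itemset, transactions):
    """Count the transactions that contain every item of itemset."""
    return sum(1 for t in transactions if itemset <= t)


def maximal(itemsets):
    """Keep only the itemsets that have no proper superset in the collection."""
    sets = set(itemsets)
    return {s for s in sets if not any(s < other for other in sets)}


def local_maximal_frequent(transactions, min_frac):
    """Local phase (brute force): level-wise search, then keep maximal sets."""
    threshold = min_frac * len(transactions)
    items = set().union(*transactions)
    frequent = set()
    level = {frozenset([i]) for i in items}
    while level:
        level = {c for c in level if support(c, transactions) >= threshold}
        frequent |= level
        # Join frequent k-itemsets into (k+1)-item candidates.
        level = {a | b for a in level for b in level
                 if len(a | b) == len(a) + 1}
    return maximal(frequent)


def global_maximal_frequent(partitions, min_frac):
    """Global phase: top-down search starting from the local maximal sets."""
    total = sum(len(p) for p in partitions)
    threshold = min_frac * total
    candidates = set()
    for p in partitions:
        candidates |= local_maximal_frequent(p, min_frac)
    candidates = maximal(candidates)
    found = set()
    while candidates:
        next_candidates = set()
        for c in candidates:
            if any(c <= f for f in found):
                continue  # subset of a known frequent itemset, cannot be maximal
            count = sum(support(c, p) for p in partitions)
            if count >= threshold:
                found.add(c)
            elif len(c) > 1:
                # Not globally frequent: descend to its (k-1)-item subsets.
                next_candidates |= {frozenset(s)
                                    for s in combinations(c, len(c) - 1)}
        candidates = next_candidates
    return maximal(found)


if __name__ == "__main__":
    # Two toy partitions standing in for two nodes' local databases.
    node1 = [frozenset("abc"), frozenset("abc"), frozenset("ad")]
    node2 = [frozenset("ab"), frozenset("bc"), frozenset("abd")]
    result = global_maximal_frequent([node1, node2], min_frac=0.5)
    print(sorted("".join(sorted(s)) for s in result))
```

In this toy run the only global candidate is {a, b, c}, the local maximal itemset of the first partition; it is not globally frequent, so the search descends to its two-item subsets. In a distributed setting, the per-partition `support` calls in the global phase would be the counts exchanged between nodes.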
Keywords :
data mining; distributed algorithms; distributed databases; octrees; performance evaluation; tree searching; workstation clusters; DMM; Distributed Max-Miner; cluster of workstations; distributed computing; distributed mining; local database; maximal frequent itemsets; performance evaluation; prefix-tree data structure; top-down search; Association rules; Clustering algorithms; Computer science; Data engineering; Data mining; Distributed computing; Distributed databases; Itemsets; Parallel algorithms; Workstations;
Conference_Title :
Cluster Computing and the Grid, 2004. CCGrid 2004. IEEE International Symposium on
Print_ISBN :
0-7803-8430-X
DOI :
10.1109/CCGrid.2004.1336638