Title :
Parallel implementation of WAP-tree mining algorithm
Author :
Wu, Ming ; Chung, Moon Jung ; Moonesinghe, H.D.K.
Author_Institution :
Dept. of CSE, Michigan State Univ., USA
Abstract :
In this paper, we present parallel algorithms for Web log mining and a performance prediction model. The algorithm, based on the WAP-tree, scans the dataset only twice and avoids the candidate generation process. We parallelized the mining part of the WAP-tree algorithm. To balance the workload among processors, we developed a task scheduling strategy. A performance model of the parallel Web mining algorithm is also developed to predict the performance of the parallel implementation. This model shows that we can get linear speedup for a small number of processors, and that the speedup grows more slowly as the number of processors increases. Using the performance model, we can also estimate the maximum speedup. We implemented the algorithm on the Lemieux system at the Pittsburgh Supercomputing Center using up to 48 processors. Our benchmark results showed that the performance model correctly predicts the performance of the parallel implementation. We achieved good speedup as the size of the dataset increased.
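The abstract does not spell out the scheduling strategy or the performance model, so the following is only a minimal sketch of one common way to balance independent mining tasks: greedy longest-processing-time-first (LPT) assignment. It is a stand-in technique, not the authors' method. The function names (`lpt_schedule`, `estimated_speedup`) and the task costs are hypothetical, and the speedup estimate ignores communication and the sequential tree-building phase.

```python
import heapq


def lpt_schedule(task_costs, num_procs):
    """Greedy longest-processing-time-first assignment (assumed strategy).

    Returns (assignment, loads): the task indices given to each processor
    and the total estimated cost on each processor.
    """
    heap = [(0.0, p) for p in range(num_procs)]  # (current load, processor id)
    heapq.heapify(heap)
    assignment = [[] for _ in range(num_procs)]
    # Hand out the most expensive tasks first, each to the least-loaded processor.
    for idx in sorted(range(len(task_costs)), key=lambda i: -task_costs[i]):
        load, p = heapq.heappop(heap)
        assignment[p].append(idx)
        heapq.heappush(heap, (load + task_costs[idx], p))
    loads = [0.0] * num_procs
    for load, p in heap:
        loads[p] = load
    return assignment, loads


def estimated_speedup(task_costs, num_procs):
    """Serial cost divided by the load of the busiest processor."""
    _, loads = lpt_schedule(task_costs, num_procs)
    return sum(task_costs) / max(loads)


if __name__ == "__main__":
    # Hypothetical per-task mining costs (arbitrary units), not measured data.
    costs = [8, 7, 6, 5, 4, 3, 2, 1]
    for p in (1, 2, 4, 8, 16):
        print(p, round(estimated_speedup(costs, p), 2))
```

With these made-up costs the estimate is 2.0 on 2 processors and 4.0 on 4, then levels off at 4.5 from 8 processors onward, because the single largest task bounds the finishing time. This only illustrates, under the stated assumptions, the qualitative behavior the abstract describes: near-linear speedup for few processors and a ceiling on the maximum speedup.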
Keywords :
Internet; data mining; parallel algorithms; performance evaluation; processor scheduling; protocols; resource allocation; Pittsburg Super Computer Center Lemieux; WAP-tree mining; Web log mining; candidate generation process; dataset scanning; linear processing speedup; maximum speed up; parallel Web mining; parallel algorithm; performance prediction model; task scheduling; workload balancing; Costs; Data mining; Explosives; Information analysis; Moon; Parallel algorithms; Predictive models; Processor scheduling; Web mining; Web sites;
Conference_Title :
Parallel and Distributed Systems, 2004. ICPADS 2004. Proceedings. Tenth International Conference on
Print_ISBN :
0-7695-2152-5
DOI :
10.1109/ICPADS.2004.1316089