DocumentCode :
3165626
Title :
Parallel Mining of Frequent Closed Patterns: Harnessing Modern Computer Architectures
Author :
Lucchese, Claudio ; Orlando, Salvatore ; Perego, Raffaele
Author_Institution :
Ca´´ Foscari Univ. Venice, Venice
fYear :
2007
fDate :
28-31 Oct. 2007
Firstpage :
242
Lastpage :
251
Abstract :
Inspired by emerging multi-core computer architectures, in this paper we present MT_CLOSED, a multi-threaded algorithm for frequent closed itemset mining (FCIM). To the best of our knowledge, this is the first FCIM parallel algorithm proposed so far. We studied how different duplicate checking techniques, typical of FCIM algorithms, may affect this parallelization. We showed that only one of them allows to decompose the global FCIM problem into independent tasks that can be executed in any order, and thus in parallel. Finally we show how MT_Closed efficiently harness modern CPUs. We designed and tested several parallelization paradigms by investigating static/dynamic decomposition and scheduling of tasks, thus showing its scalability w.r.t. to the number of CPUs. We analyzed the cache friendliness of the algorithm. Finally, we provided additional speed-up by introducing SIMD extensions.
Keywords :
computer architecture; data mining; multi-threading; parallel algorithms; scheduling; MT_CLOSED; central processing unit; duplicate checking; frequent closed itemset mining; frequent closed pattern; multicore computer architecture; multithreaded algorithm; parallel algorithm; parallel mining; parallelization; task scheduling; Algorithm design and analysis; Computer architecture; Coprocessors; Data mining; Data structures; Dynamic scheduling; Itemsets; Parallel algorithms; Parallel processing; Yarn;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Data Mining, 2007. ICDM 2007. Seventh IEEE International Conference on
Conference_Location :
Omaha, NE
ISSN :
1550-4786
Print_ISBN :
978-0-7695-3018-5
Type :
conf
DOI :
10.1109/ICDM.2007.13
Filename :
4470248
Link To Document :
بازگشت