DocumentCode
2496837
Title
P-Mine: Parallel itemset mining on large datasets
Author
Baralis, Elena ; Cerquitelli, Tania ; Chiusano, Silvia ; Grand, Anais
Author_Institution
Dipt. di Autom. e Inf., Politec. di Torino, Turin, Italy
fYear
2013
fDate
8-12 April 2013
Firstpage
266
Lastpage
271
Abstract
Itemset mining is a well-known exploratory technique used to discover interesting correlations hidden in a data collection. Since ever-increasing amounts of data are being collected and stored (e.g., business transactions, medical and biological data, context-aware applications), scalable and efficient approaches are needed to analyze these large data collections. This paper proposes a parallel disk-based approach to efficiently support frequent itemset mining on a multi-core processor. Our parallel strategy is presented in the context of the VLDB-Mine persistent data structure. Different techniques have been proposed to optimize both data- and compute-intensive aspects of the mining algorithm. Preliminary experiments, performed on both real and synthetic datasets, show promising results in improving the efficiency and scalability of the mining activity on large datasets.
Keywords
data mining; data structures; multiprocessing systems; parallel processing; P-Mine; VLDB-Mine persistent data structure; data collection; frequent itemset mining; large datasets; multicore processor; parallel disk-based approach; parallel itemset mining; parallel strategy; real datasets; synthetic datasets; Data mining; Data structures; Itemsets; Multicore processing; Prefetching; Scalability
fLanguage
English
Publisher
IEEE
Conference_Titel
Data Engineering Workshops (ICDEW), 2013 IEEE 29th International Conference on
Conference_Location
Brisbane, QLD
Print_ISBN
978-1-4673-5303-8
Electronic_ISBN
978-1-4673-5302-1
Type
conf
DOI
10.1109/ICDEW.2013.6547461
Filename
6547461