• DocumentCode
    2496837
  • Title
    P-Mine: Parallel itemset mining on large datasets
  • Author
    Baralis, Elena ; Cerquitelli, Tania ; Chiusano, Silvia ; Grand, Anais
  • Author_Institution
    Dipt. di Autom. e Inf., Politec. di Torino, Turin, Italy
  • fYear
    2013
  • fDate
    8-12 April 2013
  • Firstpage
    266
  • Lastpage
    271
  • Abstract
    Itemset mining is a well-known exploratory technique used to discover interesting correlations hidden in a data collection. Since ever-increasing amounts of data are being collected and stored (e.g., business transactions, medical and biological data, context-aware applications), scalable and efficient approaches are needed to analyze these large data collections. This paper proposes a parallel disk-based approach to efficiently support frequent itemset mining on a multi-core processor. Our parallel strategy is presented in the context of the VLDB-Mine persistent data structure. Different techniques are proposed to optimize both the data-intensive and compute-intensive aspects of the mining algorithm. Preliminary experiments, performed on both real and synthetic datasets, show promising results in improving the efficiency and scalability of the mining activity on large datasets.
  • Keywords
    data mining; data structures; multiprocessing systems; parallel processing; P-Mine; VLDB-Mine persistent data structure; data collection; frequent itemset mining; large datasets; multicore processor; parallel disk-based approach; parallel itemset mining; parallel strategy; real datasets; synthetic datasets; Data mining; Data structures; Itemsets; Multicore processing; Prefetching; Scalability
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Engineering Workshops (ICDEW), 2013 IEEE 29th International Conference on
  • Conference_Location
    Brisbane, QLD
  • Print_ISBN
    978-1-4673-5303-8
  • Electronic_ISBN
    978-1-4673-5302-1
  • Type
    conf
  • DOI
    10.1109/ICDEW.2013.6547461
  • Filename
    6547461