• DocumentCode
    3438356
  • Title

    Accelerating Frequent Itemsets Mining on the Cloud: A MapReduce-Based Approach

  • Author

    Farzanyar, Zahra ; Cercone, Nick

  • Author_Institution
    Comput. Sci. & Eng. Dept., York Univ., Toronto, ON, Canada
  • fYear
    2013
  • fDate
    7-10 Dec. 2013
  • Firstpage
    592
  • Lastpage
    598
  • Abstract
    Frequent pattern mining has a critical role in mining associations, sequential patterns, correlations, causality, episodes, multidimensional patterns, emerging patterns, and many other significant data mining tasks. With the exponential growth of available data, most traditional frequent pattern mining algorithms become ineffective due to either huge resource requirements or large communication overhead. Cloud computing has shown that very large datasets can be processed on commodity clusters, given the right programming model. MapReduce, a parallel programming model and one of the most important techniques for cloud computing, has emerged for mining datasets of terabyte scale or larger on clusters of computers. Converting a serial mining algorithm into a distributed algorithm on the MapReduce framework is not necessarily difficult, but the mining performance can be unsatisfactory. In this paper, we propose a method that finds all frequent itemsets using just two MapReduce phases in a time- and communication-efficient manner. We present experimental results to corroborate our theoretical claims.
  • Keywords
    cloud computing; data mining; distributed algorithms; parallel programming; MapReduce based approach; MapReduce framework; accelerating frequent itemsets mining; data mining associations; distributed algorithm; frequent pattern mining algorithms; parallel programming model; resource requirements; serial mining algorithm; Algorithm design and analysis; Clustering algorithms; Computational modeling; Computers; Itemsets; Big Data Mining; Frequent Itemset Mining; MapReduce
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining Workshops (ICDMW), 2013 IEEE 13th International Conference on
  • Conference_Location
    Dallas, TX
  • Print_ISBN
    978-1-4799-3143-9
  • Type

    conf

  • DOI
    10.1109/ICDMW.2013.106
  • Filename
    6753974