• DocumentCode
    589159
  • Title

    MapReduce-based Closed Frequent Itemset Mining with Efficient Redundancy Filtering

  • Author

    Su-Qi Wang ; Yu-Bin Yang ; Guang-Peng Chen ; Yang Gao ; Yao Zhang

  • Author_Institution
    State Key Lab. for Novel Software Technol., Nanjing Univ., Nanjing, China
  • fYear
    2012
  • fDate
    10-10 Dec. 2012
  • Firstpage
    449
  • Lastpage
    453
  • Abstract
    Mining closed frequent itemsets (CFIs) plays a fundamental role in many real-world data mining applications. However, memory requirements and computational cost have become the bottleneck of CFI mining algorithms, particularly on large-scale datasets, which makes mining closed frequent itemsets from such datasets a significant and challenging problem. To address this problem, this paper proposes and implements a parallelized AFOPT-close algorithm on MapReduce, a cloud computing framework widely used to process large-scale data. An efficient parallelized method for checking whether a frequent itemset is globally closed is also proposed on the MapReduce platform to further improve mining performance. Experimental results are provided and analyzed to verify the efficiency and effectiveness of the proposed methods for mining closed frequent itemsets.
  • Keywords
    cloud computing; data mining; information filtering; parallel algorithms; CFI mining algorithm; MapReduce cloud computing framework; MapReduce-based closed frequent itemset mining; computational cost; data mining applications; large scale datasets; memory requirement; parallelized AFOPT-close algorithm; redundancy filtering; Algorithm design and analysis; Clustering algorithms; Conferences; Data mining; Itemsets; Redundancy; Scalability; AFOPT-close; Hadoop; MapReduce; closed frequent itemset; data mining;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining Workshops (ICDMW), 2012 IEEE 12th International Conference on
  • Conference_Location
    Brussels
  • Print_ISBN
    978-1-4673-5164-5
  • Type

    conf

  • DOI
    10.1109/ICDMW.2012.24
  • Filename
    6406474