Title :
Smart Cache: An Optimized MapReduce Implementation of Frequent Itemset Mining
Author :
Dachuan Huang ; Yang Song ; Ramani Routray ; Feng Qin
Abstract :
Frequent Itemset Mining (FIM) is a classic data mining topic with many real-world applications, such as market basket analysis. Many algorithms, including Apriori, FP-Growth, and Eclat, have been proposed in the FIM field. As dataset sizes grow, researchers have proposed MapReduce versions of FIM algorithms to meet the big data challenge. This paper proposes new improvements to the MapReduce implementation of FIM algorithms by introducing a cache layer and a selective online analyzer. We evaluate the effectiveness and efficiency of the resulting system, Smart Cache, through extensive experiments on four public datasets. Compared with the state-of-the-art solution, Smart Cache reduces total execution time by 45.4% on average and by up to 97.0%.
Keywords :
Big Data; cache storage; data mining; Big Data challenge; FIM; cache layer; frequent itemset mining; optimized MapReduce implementation; selective online analyzer; smart cache; Algorithm design and analysis; Itemsets; Libraries; Linear regression; Machine learning algorithms; Turning;
Conference_Title :
Cloud Engineering (IC2E), 2015 IEEE International Conference on
Conference_Location :
Tempe, AZ
DOI :
10.1109/IC2E.2015.12