DocumentCode :
3278414
Title :
Complex statistical analysis of big data: Implementation and application of Apriori and FP-Growth algorithm based on MapReduce
Author :
Zhuobo Rong ; Dawen Xia ; Zili Zhang
Author_Institution :
Sch. of Comput. & Inf. Sci., Southwest Univ., Chongqing, China
fYear :
2013
fDate :
23-25 May 2013
Firstpage :
968
Lastpage :
972
Abstract :
In a single-machine environment, the Apriori and FP-Growth algorithms suffer from high memory consumption, low computing performance, and poor scalability and reliability when mining association rules from large-scale data. We therefore propose a new implementation method, based on the MapReduce parallel environment, for mining frequent itemsets and generating association rules. The method is verified on real datasets of different sizes, using clusters with different numbers of nodes, with speedup, scalability, and reliability as the evaluation indicators. The results show that the method is feasible and valid, and that it improves the overall performance and efficiency of the Apriori and FP-Growth algorithms enough to meet the needs of large-scale association rule mining.
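As a rough illustration of the idea in the abstract, the sketch below shows how one support-counting pass of frequent itemset mining can be phrased as a map step and a reduce step. It is not the authors' implementation: it runs in a single Python process rather than on a MapReduce cluster, it enumerates all k-subsets of each transaction instead of applying Apriori candidate pruning, and the function names (`map_phase`, `reduce_phase`, `frequent_itemsets`) and the sample transactions are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def map_phase(transaction, k):
    """Mapper (hypothetical name): emit (k-itemset, 1) for every k-subset of one transaction."""
    items = sorted(set(transaction))  # sort so each itemset has one canonical key
    for itemset in combinations(items, k):
        yield itemset, 1


def reduce_phase(pairs, min_support):
    """Reducer (hypothetical name): sum the counts per itemset, keep those >= min_support."""
    counts = defaultdict(int)
    for itemset, count in pairs:
        counts[itemset] += count
    return {s: c for s, c in counts.items() if c >= min_support}


def frequent_itemsets(transactions, k, min_support):
    """Run one map/reduce counting pass in-process (no cluster, no candidate pruning)."""
    pairs = (pair for t in transactions for pair in map_phase(t, k))
    return reduce_phase(pairs, min_support)


# Made-up sample data, for illustration only.
transactions = [
    ["bread", "milk"],
    ["bread", "diapers", "beer"],
    ["milk", "diapers", "beer"],
    ["bread", "milk", "diapers"],
]

print(frequent_itemsets(transactions, k=2, min_support=2))
```

On a real cluster, each mapper would process its own split of the transaction file, and the framework would group the emitted keys before the reducers sum them. That partitioning is what lets the counting work spread across nodes.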
Keywords :
data analysis; data mining; parallel processing; statistical analysis; FP-Growth algorithm; MapReduce parallel environment; Apriori algorithm; complex statistical big data analysis; computing performance; frequent itemsets mining; large-scale data association rules mining; memory consumption; single machine environment; Educational institutions; Indexes; Apriori; association analysis; big data statistics; FP-Growth; MapReduce
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Software Engineering and Service Science (ICSESS), 2013 4th IEEE International Conference on
Conference_Location :
Beijing
ISSN :
2327-0586
Print_ISBN :
978-1-4673-4997-0
Type :
conf
DOI :
10.1109/ICSESS.2013.6615467
Filename :
6615467