DocumentCode :
2368250
Title :
Executing multiple group-by query in a MapReduce approach
Author :
Pan, Jie ; Magoulès, Frédéric ; Le Biannic, Yann
Author_Institution :
Ecole Centrale Paris, Châtenay-Malabry, France
Volume :
2
fYear :
2010
fDate :
June 29 2010-July 1 2010
Firstpage :
38
Lastpage :
41
Abstract :
As ever more information is generated, data analysis software faces the challenge of processing large volumes of data. The arrival of MapReduce makes it possible to use commodity hardware to process large data sets in parallel. In this paper, we focus on a special type of data analysis query, namely the multiple group-by query. We give an initial implementation of the multiple group-by query based on the MapReduce model. Because the communication cost is not negligible, we then propose an optimized version based on the MapCombineReduce model, which addresses this issue. Our optimized version shows better speed-up and better scalability than the initial version.
Keywords :
data analysis; parallel processing; query processing; very large databases; MapCombineReduce model; commodity hardware; communication cost; data analysis software; group-by query; large volume data processing; Analytical models; Biological system modeling; Computational modeling; Monitoring; MapCombineReduce; MapReduce; multiple group by query
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Communication Systems, Networks and Applications (ICCSNA), 2010 Second International Conference on
Conference_Location :
Hong Kong
Print_ISBN :
978-1-4244-7475-2
Type :
conf
DOI :
10.1109/ICCSNA.2010.5588949
Filename :
5588949