DocumentCode :
1618749
Title :
Optimization of multi-join query processing within MapReduce
Author :
Jiang, Miao ; Wang, Ye
Author_Institution :
Sch. of Comput. Sci., Fudan Univ., Shanghai, China
fYear :
2010
Firstpage :
77
Lastpage :
83
Abstract :
MapReduce is a programming model widely used to process large-scale data. Many tasks can be implemented under the framework, such as search-engine data processing and machine learning. However, current implementations of MapReduce offer no efficient support for the join operation. Earlier work proposed Map-Reduce-Merge for the join operator; however, because of the time cost incurred in the Reduce phase, we argue that it is better to omit the Reduce step, and the cost it brings, when implementing joins. In this paper, we design and implement a join algorithm for relational data in a MapReduce environment. We also present a method for joining many relations. We conduct a series of experiments to verify the effectiveness and efficiency of the proposed methods.
Keywords :
parallel programming; query processing; MapReduce; map reduce merge; multijoin query processing optimization; programming model; relational data; Data mining; Data processing; Distributed databases; File systems; Mercury (metals); Program processors; Programming; MapReduce; cluster; distributed; join; relational data
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Universal Communication Symposium (IUCS), 2010 4th International
Conference_Location :
Beijing
Print_ISBN :
978-1-4244-7821-7
Type :
conf
DOI :
10.1109/IUCS.2010.5666765
Filename :
5666765