Regional Information Center for Science and Technology (RICeST) - Optimization of multi-join query processing within MapReduce

DocumentCode :

1618749

Title :

Optimization of multi-join query processing within MapReduce

Author :

Jiang, Miao ; Wang, Ye

Author_Institution :

Sch. of Comput. Sci., Fudan Univ., Shanghai, China

fYear :

2010

Firstpage :

Lastpage :

Abstract :

MapReduce is a programming model commonly used to process large-scale data. Many tasks can be implemented within the framework, such as data processing for search engines and machine learning. However, current implementations of MapReduce offer no efficient support for the join operation. Earlier work studied Map-Reduce-Merge for the join operator; however, because of the time cost of the Reduce phase, we argue that it is better to omit the Reduce procedure, along with the cost it brings, when implementing joins. In this paper, we design and implement a join algorithm on relational data in a MapReduce environment. We also present a method for joining many relations. We conduct a series of experiments to verify the effectiveness and efficiency of the proposed methods.

Keywords :

parallel programming; query processing; MapReduce; map reduce merge; multijoin query processing optimization; programming model; relational data; Data mining; Data processing; Distributed databases; File systems; Mercury (metals); Program processors; Programming; cluster; distributed; join

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Universal Communication Symposium (IUCS), 2010 4th International

Conference_Location :

Beijing

Print_ISBN :

978-1-4244-7821-7

Type :

conf

DOI :

10.1109/IUCS.2010.5666765

Filename :

5666765

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1618749