DocumentCode
2427938
Title
A Hybrid Join Algorithm on Top of Map Reduce
Author
Hu, Weisong ; Ma, Lili ; Liu, Xiaowei ; Qi, Hongwei ; Zha, Li ; Liao, Huaming ; Zhang, Yuezhou
fYear
2011
fDate
24-26 Oct. 2011
Firstpage
44
Lastpage
50
Abstract
Hadoop has shown great power in processing vast amounts of data in parallel. Hive, the database on Hadoop, enables more experts to process relational data by providing a SQL-like interface. However, Hive does not provide an efficient approach to join, a common but expensive operator in relational databases. Given the importance of join, this paper proposes a novel hybrid algorithm, HJA, which automatically chooses the better-performing method among several candidates: divide and memory copy merge, Partition Join (PJ), and naïve Hive join. Experiments show that HJA achieves the best performance in most situations.
Keywords
SQL; parallel processing; relational databases; HJA; Hadoop; MapReduce; Partition Join; SQL-like interface; naive Hive join; Semantics; auto-tuning
fLanguage
English
Publisher
IEEE
Conference_Titel
Semantics Knowledge and Grid (SKG), 2011 Seventh International Conference on
Conference_Location
Beijing
Print_ISBN
978-1-4577-1323-1
Type
conf
DOI
10.1109/SKG.2011.13
Filename
6088090