DocumentCode :
1847900
Title :
Performance of Left Outer Join on Hadoop with Right Side within Single Node Memory Size
Author :
Byambajav, Byambajargal ; Wlodarczyk, Tomasz Wiktor ; Rong, Chunming ; LePendu, Paea ; Shah, Nigam
Author_Institution :
Dept. of Comput. Sci. & Electr. Eng., Univ. of Stavanger, Stavanger, Norway
fYear :
2012
fDate :
26-29 March 2012
Firstpage :
1075
Lastpage :
1080
Abstract :
In this paper we compare the performance of different implementations of the join operation in Hadoop in a scenario where the right side of the join fits within the memory of a single node. We present results for several implementations, both in pure MapReduce and in Pig, both based on HDFS. We also compare the distributed performance of those implementations with a single-node implementation in MySQL. Results show that the Pig implementations fall short of the pure MapReduce versions by a larger margin than expected. Moreover, we notice that Map tasks seem to be the element that influences performance the most, especially for the potentially more efficient methods. So far, we have achieved the best performance using a singleton pattern join. However, there are reasons to believe that this performance can be improved further with better control of the number of Map tasks.
Keywords :
SQL; parallel programming; storage management; HDFS; Hadoop; Map Reduce; MySQL; Pig; single node memory size; Bioinformatics; Computer architecture; Context; Indexes; Java; Ontologies; Semantics; Join; MapReduce; Semantic Expansion
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Advanced Information Networking and Applications Workshops (WAINA), 2012 26th International Conference on
Conference_Location :
Fukuoka
Print_ISBN :
978-1-4673-0867-0
Type :
conf
DOI :
10.1109/WAINA.2012.20
Filename :
6185392