Title :
Map-Side Merge Joins for Scalable SPARQL BGP Processing
Author :
Przyjaciel-Zablocki, Martin ; Schaetzle, Alexander ; Skaley, Eduard ; Hornung, Thomas ; Lausen, Georg
Author_Institution :
Dept. of Comput. Sci., Univ. of Freiburg, Freiburg, Germany
Abstract :
In recent times, it has been widely recognized that, due to their inherent scalability, frameworks based on MapReduce are indispensable for so-called "Big Data" applications. However, for Semantic Web applications using SPARQL, there is still a demand for sophisticated MapReduce join techniques for processing basic graph patterns (BGPs), which are at the core of SPARQL. Renowned for their stable and efficient performance, sort-merge joins have become widely used in DBMSs. In this paper, we demonstrate the adaptation of merge joins for SPARQL BGP processing with MapReduce. Our technique supports both n-way joins and sequences of join operations by applying merge joins within the map phase of MapReduce, while the reduce phase is used only to fulfill the preconditions of a subsequent join iteration. Our experiments with the LUBM benchmark show an average performance benefit between 15% and 48% compared to other MapReduce-based approaches while at the same time scaling linearly with the RDF dataset size.
Keywords :
query processing; semantic Web; very large databases; Big Data; MapReduce; basic graph patterns; scalable SPARQL BGP processing; Educational institutions; Information management; Layout; Pattern matching; Resource description framework; Sorting; Map-Side Merge Join; RDF; SPARQL
Conference_Title :
Cloud Computing Technology and Science (CloudCom), 2013 IEEE 5th International Conference on
Conference_Location :
Bristol
DOI :
10.1109/CloudCom.2013.9