Efficient SPARQL Query Evaluation in a Database Cluster

Author

Fang Du ; Haoqiong Bian ; Yueguo Chen ; Xiaoyong Du

Author_Institution

DEKE Lab., Renmin Univ. of China, Beijing, China

fYear

2013

fDate

June 27 2013-July 2 2013

Firstpage

165

Lastpage

172

Abstract

Efficient SPARQL query evaluation is a significant challenge when the database contains billions of RDF triples, which is common for many existing Web-scale RDF data sources. We address this challenge by 1) effectively partitioning the whole RDF dataset into small partitions according to the schemas of the RDF subjects, and 2) carefully placing the partitions within clusters so that, on each local partition, we can take full advantage of a state-of-the-art SPARQL query processing engine, and across the partitions, we can exploit the power of parallel databases to achieve scalable query evaluation over massive RDF data. This paper introduces the data partitioning and placement strategies, as well as the SPARQL query evaluation and optimization techniques in a cluster environment. Experiments are conducted over a synthesized dataset and a real dataset containing billions of triples. The results show that the approach achieves better query evaluation performance than the baseline.

Keywords

query processing; relational databases; SPARQL query evaluation; SPARQL query optimization technique; Web-scale RDF data source; data partitioning strategy; data placement strategy; database cluster; query evaluation; resource description framework; Engines; Indexes; Optimization; Partitioning algorithms; Query processing; Resource description framework; RDF; SPARQL query; data partitioning; parallel database;

fLanguage

English

Publisher

IEEE

Conference_Titel

Big Data (BigData Congress), 2013 IEEE International Congress on

Conference_Location

Santa Clara, CA

Print_ISBN

978-0-7695-5006-0

Type

conf

DOI

10.1109/BigData.Congress.2013.30

Filename

6597133