Title :
A vision for SPARQL multi-query optimization on MapReduce
Author_Institution :
Dept. of Comput. Sci., North Carolina State Univ., Raleigh, NC, USA
Abstract :
MapReduce has emerged as a key component of large-scale data analysis in the cloud. However, it presents challenges for SPARQL query processing because it lacks traditional join optimization machinery such as statistics and indexes, as well as techniques for translating join-intensive workloads into efficient MapReduce workflows. Further, MapReduce is primarily a batch processing paradigm. It is therefore plausible that many workloads will include a batch of queries, or that new queries will be generated from given queries, e.g., through query rewriting of inferencing queries. Consequently, multi-query optimization deserves attention, and this paper lays out a vision for rule-based multi-query optimization built on a recently proposed data model and algebra, the Nested TripleGroup Data Model and Algebra, for efficient SPARQL query processing on MapReduce.
Keywords :
batch processing (computers); cloud computing; data analysis; data models; inference mechanisms; knowledge based systems; process algebra; query languages; query processing; rewriting systems; MapReduce workflows; Nested TripleGroup Data Model and Algebra; SPARQL multiquery optimization; SPARQL query processing; batch processing; cloud data analysis; data model; inferencing queries; join-intensive workload translation; large scale data analysis; query rewriting; rule-based multiquery optimization; Algebra; Cloud computing; Conferences; Data models; Optimization; Pattern matching; Query processing
Conference_Title :
Data Engineering Workshops (ICDEW), 2013 IEEE 29th International Conference on
Conference_Location :
Brisbane, QLD
Print_ISBN :
978-1-4673-5303-8
Electronic_ISBN :
978-1-4673-5302-1
DOI :
10.1109/ICDEW.2013.6547420