Title :
Scalable Semantics: The Silver Lining of Cloud Computing
Author :
Newman, Andre ; Li, Yuan-Fang ; Hunter, Jane
Author_Institution :
Sch. of ITEE, Univ. of Queensland, Brisbane, QLD
Abstract :
Semantic inferencing and querying across large-scale RDF triple stores are notoriously slow. Our objective is to expedite this process by employing Google's MapReduce framework to implement scale-out distributed querying and reasoning. This approach requires RDF graphs to be decomposed into smaller units that are distributed across computational nodes. RDF molecules appear to offer an ideal approach, providing an intermediate level of granularity between RDF graphs and triples. However, the original RDF molecule definition has inherent limitations that will adversely affect performance. In this paper, we propose a number of extensions to RDF molecules (hierarchy and ordering) to overcome these limitations. We then present some implementation details for our MapReduce-based RDF molecule store. Finally, we evaluate the benefits of our approach in the context of the BioMANTA project, an application that requires integration and querying across large-scale protein-protein interaction datasets.
Keywords :
Internet; inference mechanisms; query processing; semantic Web; BioMANTA project; MapReduce framework; RDF molecules; cloud computing; large-scale RDF triple stores; protein-protein interaction datasets; scalable semantics; semantic inferencing; Computer architecture; Distributed computing; Distributed processing; Large scale integration; Large-scale systems; OWL; Resource description framework; Silver; MapReduce; RDF; data integration
Conference_Title :
eScience, 2008. eScience '08. IEEE Fourth International Conference on
Conference_Location :
Indianapolis, IN
Print_ISBN :
978-1-4244-3380-3
Electronic_ISBN :
978-0-7695-3535-7
DOI :
10.1109/eScience.2008.23