• DocumentCode
    612186
  • Title
    Self-organizing structured RDF in MonetDB
  • Author
    Pham, Minh-Tan
  • Author_Institution
    CWI, Amsterdam, Netherlands
  • fYear
    2013
  • fDate
    8-12 April 2013
  • Firstpage
    310
  • Lastpage
    313
  • Abstract
    The semantic web uses RDF as its data model, providing ultimate flexibility for users to represent and evolve data without the need for a schema. Yet this flexibility poses challenges in implementing efficient RDF stores, ranging from plans with very many self-joins on a triple table, to difficulties in optimizing these plans, to a lack of data locality: without a notion of multi-attribute data structure, clustered indexing opportunities are lost. Apart from performance issues, users of huge RDF graphs often have problems formulating queries, as they lack any system-supported notion of the structure in the data. In this research, we exploit the observation that real RDF data, while not as regularly structured as relational data, still has the great majority of triples conforming to regular patterns. We conjecture that a system that recognizes this structure automatically would allow RDF stores to become both more efficient and easier to use. Concretely, we propose to derive self-organizing RDF that stores data in PSO format in such a way that the regular parts of the data physically correspond to relational columnar storage, and we propose RDFscan/RDFjoin algorithms that compute star-patterns over these without wasting effort in self-joins. These regular parts, i.e., tables, are identified on ingestion by a schema discovery algorithm; as such, users will gain an SQL view of the regular part of the RDF data. This research aims to produce a state-of-the-art SPARQL frontend for MonetDB as a by-product, and we already present some preliminary results on this platform.
  • Keywords
    SQL; data models; pattern clustering; query formulation; relational databases; self-organising feature maps; semantic Web; MonetDB; PSO format; RDF data; RDF graph; RDF store; RDFjoin algorithm; RDFscan algorithm; SPARQL; clustered indexing; data locality; data model; multiattribute data structure; relational columnar storage; relational data; schema discovery algorithm; self-organizing RDF; self-organizing structured RDF; star-pattern; system-supported notion; Resource description framework
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Engineering Workshops (ICDEW), 2013 IEEE 29th International Conference on
  • Conference_Location
    Brisbane, QLD
  • Print_ISBN
    978-1-4673-5303-8
  • Electronic_ISBN
    978-1-4673-5302-1
  • Type
    conf
  • DOI
    10.1109/ICDEW.2013.6547471
  • Filename
    6547471