Author_Institution :
Dept. of Inf. Technol., Uppsala Univ., Uppsala, Sweden
Abstract :
Multidimensional array data, such as remote-sensing imagery and time series, climate model simulations, telescope observations, and medical images, contribute massively to virtually all science and engineering domains, and hence play a key role in 'Big Data' challenges. Pure array storage management and analytics are relatively well understood today. However, arrays in practice never come alone, but are accompanied by metadata, including domain, range, provenance information, etc. The structure of this metadata is far less regular than that of arrays or tables, and it may be incomplete or differ from one array instance to another. Particularly in the field of the Semantic Web, such integrated representations must convey sufficiently complete and reasonable semantics for machine-to-machine communication. We show how the Resource Description Framework (RDF), the Semantic Web graph model for metadata, can be leveraged for such data/metadata integration, specifically for representing spatio-temporal grid data. Based on the notion of a coverage as established by the Open Geospatial Consortium (OGC), we present a hybrid data store where efficiently represented arrays are incorporated as nodes into RDF graphs and connected to their metadata. We have extended the Semantic Web query language SPARQL to incorporate array query semantics and other functionality, making it suitable for processing large numeric arrays, including geo coverages.
Keywords :
"Arrays","Resource description framework","Metadata","Databases","Semantics"