DocumentCode :
2182533
Title :
Profiling linked open data with ProLOD
Author :
Böhm, Christoph ; Naumann, Felix ; Abedjan, Ziawasch ; Fenz, Dandy ; Grütze, Toni ; Hefenbrock, Daniel ; Pohl, Matthias ; Sonnabend, David
Author_Institution :
Hasso-Plattner-Inst., Potsdam, Germany
fYear :
2010
fDate :
1-6 March 2010
Firstpage :
175
Lastpage :
178
Abstract :
Linked open data (LOD), as provided by a quickly growing number of sources, constitutes a wealth of easily accessible information. However, this data is not easy to understand. It is usually provided as a set of (RDF) triples, often enough in the form of enormous files covering many domains. What is more, the data usually has a loose structure when it is derived from end-user generated sources, such as Wikipedia. Finally, the quality of the actual data is also worrisome, because it may be incomplete, poorly formatted, inconsistent, etc. To understand and profile such linked open data, traditional data profiling methods do not suffice. With ProLOD, we propose a suite of methods ranging from the domain level (clustering, labeling), via the schema level (matching, disambiguation), to the data level (data type detection, pattern detection, value distribution). Packaged into an interactive, web-based tool, they allow iterative exploration and discovery of new LOD sources. Thus, users can quickly gauge the relevance of the source for the problem at hand (e.g., some integration task), and focus on and explore the relevant subset.
Keywords :
iterative methods; meta data; LOD; ProLOD; RDF; Web based tool; data profiling methods; data type detection; iterative exploration; linked open data; pattern detection; profiling linked open data; value distribution; Data analysis; Data visualization; Labeling; Ontologies; Packaging; Pattern matching; Prototypes; Resource description framework; Semantic Web; Wikipedia;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Data Engineering Workshops (ICDEW), 2010 IEEE 26th International Conference on
Conference_Location :
Long Beach, CA
Print_ISBN :
978-1-4244-6522-4
Electronic_ISBN :
978-1-4244-6521-7
Type :
conf
DOI :
10.1109/ICDEW.2010.5452762
Filename :
5452762