Title :
Graph mining meets the Semantic Web
Author :
Lee, Sangkeun ; Sukumar, Sreenivas R. ; Lim, Seung-Hwan
Author_Institution :
Comput. Sci. & Eng. Div., Oak Ridge Nat. Lab., Oak Ridge, TN, USA
Abstract :
The Resource Description Framework (RDF) and the SPARQL Protocol and RDF Query Language (SPARQL) were introduced about a decade ago to enable flexible, schema-free data interchange on the Semantic Web. Today, data scientists use the framework as a scalable graph representation for integrating, querying, exploring, and analyzing data sets hosted at different sources. With increasing adoption, the need for graph mining capabilities for the Semantic Web has emerged. We address that need by implementing three popular iterative graph mining algorithms (triangle counting, connected component analysis, and PageRank) as SPARQL queries wrapped within Python scripts. We evaluate the performance of our implementation on six real-world data sets and show that graph mining algorithms with a linear-algebra formulation can indeed be run on data represented as RDF graphs through the SPARQL query interface.
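As a rough illustration of the pattern the abstract describes (a script that repeatedly issues one join-and-aggregate step until the result converges), the following minimal sketch runs PageRank over a list of triples in plain Python. It is not the authors' implementation: the toy graph, the "links" predicate, and the function names pagerank_step and pagerank are assumptions made here, and the per-iteration step is written in Python rather than SPARQL so that the example has no dependencies.

```python
from collections import defaultdict

# Hypothetical toy graph as (subject, predicate, object) triples.
# The node names and the "links" predicate are illustrative only.
TRIPLES = [
    ("a", "links", "b"),
    ("a", "links", "c"),
    ("b", "links", "c"),
    ("c", "links", "a"),
]


def pagerank_step(triples, rank, damping=0.85):
    """One iteration: join each edge with its source's rank, then sum per target.

    This is the join-and-aggregate a single query could express; it is
    equivalent to one sparse matrix-vector product. Nodes without outgoing
    edges are not redistributed in this sketch.
    """
    out_degree = defaultdict(int)
    for s, _, _ in triples:
        out_degree[s] += 1

    n = len(rank)
    new_rank = {node: (1.0 - damping) / n for node in rank}
    for s, _, o in triples:
        new_rank[o] += damping * rank[s] / out_degree[s]
    return new_rank


def pagerank(triples, damping=0.85, tol=1e-9, max_iter=100):
    """Driver loop: repeat the step until the ranks stop changing."""
    nodes = {s for s, _, _ in triples} | {o for _, _, o in triples}
    rank = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(max_iter):
        new_rank = pagerank_step(triples, rank, damping)
        delta = sum(abs(new_rank[v] - rank[v]) for v in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


ranks = pagerank(TRIPLES)
```

In a SPARQL-backed version, the body of pagerank_step would presumably be replaced by a query sent to the triple store, with the Python loop left to check convergence.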
Keywords :
SQL; data mining; graph theory; iterative methods; linear algebra; query languages; semantic Web; Python scripts; RDF; RDF graphs; RDF query language; SPARQL protocol; SPARQL queries; SPARQL query interface; data scientists; flexible schema-free data interchange; linear-algebra formulation; popular iterative graph mining algorithms; resource description framework; scalable graph representation; Algorithm design and analysis; Communities; Data mining; Database languages; Resource description framework; Software algorithms
Conference_Title :
2015 31st IEEE International Conference on Data Engineering Workshops (ICDEW)
Conference_Location :
Seoul
DOI :
10.1109/ICDEW.2015.7129544