Regional Information Center for Science and Technology - A Datalog Engine for Iterative Graph Algorithms on Large Clusters

DocumentCode :

3739806

Title :

A Datalog Engine for Iterative Graph Algorithms on Large Clusters

Author :

Jacek Sroka;Marek Rogala;Michal Adamczyk;Jan Hidders

Author_Institution :

Univ. of Warsaw, Warsaw, Poland

Year :

2015

Firstpage :

113

Lastpage :

114

Abstract :

Distributed computations on graphs gained importance with the emergence of large graphs, e.g., in the web or social networks. Frameworks like Hadoop, Giraph and Spark are used for their processing. Yet, they require advanced programming techniques to minimize skew and data shuffling. Declarative, query-like, but at the same time efficient solutions like Pig for general purpose analytics are lacking. In this paper we promote the use of declarative datalog with aggregation for large graph processing. We present an implementation which extends Apache Spark with the capability of executing datalog queries. This approach makes it possible to express graph algorithms in a well-studied declarative query language and execute them on an existing and mature infrastructure for distributed computation. At the same time, the data processed with datalog queries is fully integrated with the caching mechanism of Spark and can be part of a larger iterative algorithm.
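
To make "datalog with aggregation" concrete, here is a minimal sketch of single-source shortest paths written as two recursive rules with a min aggregate, together with a small single-machine evaluator. The rule syntax in the comments, the function name `shortest_paths`, and the delta-based evaluation loop are all illustrative assumptions; the record does not describe the engine's actual syntax or its Spark integration, and this sketch uses plain Python rather than Spark.

```python
from collections import defaultdict

# Illustrative rules (hypothetical syntax, not taken from the paper):
#   path(v, min<d>) :- source(v), d = 0.
#   path(v, min<d>) :- path(u, d1), edge(u, v, w), d = d1 + w.


def shortest_paths(edges, source):
    """Evaluate the two rules above to a fixpoint.

    edges:  iterable of (u, v, w) facts; weights are assumed non-negative.
    source: the single vertex satisfying source(v).
    Returns a dict mapping each reachable vertex to its minimal distance.
    """
    # Index edge facts by start vertex so the join of path(u, d1) with
    # edge(u, v, w) becomes a dictionary lookup.
    outgoing = defaultdict(list)
    for u, v, w in edges:
        outgoing[u].append((v, w))

    dist = {source: 0}   # facts derived so far (first rule seeds the source)
    delta = {source: 0}  # facts that improved in the previous iteration

    # Only facts that changed in the last round can produce new derivations,
    # so each round joins just the delta against the edge relation.
    while delta:
        new_delta = {}
        for u, d1 in delta.items():
            for v, w in outgoing[u]:
                d = d1 + w
                best = min(dist.get(v, float("inf")),
                           new_delta.get(v, float("inf")))
                if d < best:  # the min aggregate keeps only the smallest d
                    new_delta[v] = d
        dist.update(new_delta)
        delta = new_delta
    return dist


if __name__ == "__main__":
    facts = [("a", "b", 1), ("b", "c", 2), ("a", "c", 5), ("c", "d", 1)]
    print(shortest_paths(facts, "a"))
```

The loop stops once an iteration derives no improved fact, which is the fixpoint of the rules. With negative-weight cycles it would not terminate, hence the non-negative weight assumption.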

Keywords :

"Sparks","Optimization","Clustering algorithms","Programming","Iterative methods","Semantics","Conferences"

Publisher :

IEEE

Conference_Title :

Data Science and Data Intensive Systems (DSDIS), 2015 IEEE International Conference on

Type :

conf

DOI :

10.1109/DSDIS.2015.76

Filename :

7396490

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3739806