DocumentCode
2193976
Title
A Distributed Algorithm for Determining the Provenance of Data
Author
Groth, Paul T.
Author_Institution
Inf. Sci. Inst., Univ. of Southern California, CA, USA
fYear
2008
fDate
7-12 Dec. 2008
Firstpage
166
Lastpage
173
Abstract
As computational techniques for tracking provenance have become more widely used, applications are beginning to produce large quantities of provenance information. Furthermore, many of these applications are composed of distributed components (e.g., scientific workflows) that may, for reasons of scalability, security, or policy, need to store this information across multiple sites. In this paper, we describe an algorithm, D-PQuery, for determining the provenance of data from distributed sources of provenance information in a parallel fashion. To enable scientists to use D-PQuery on existing Grid infrastructure, we present an implementation of the algorithm as a Condor DAGMan workflow that works across Kickstart records, which are produced in several production e-Science applications, including the example application used in this paper, the astronomy application Montage. Initial performance benchmarks are also presented.
Keywords
distributed algorithms; grid computing; query processing; scientific information systems; Condor DAGMan workflow; D-PQuery; Kickstart records; Montage; astronomy application; computational techniques; data provenance; distributed algorithm; distributed components; e-science applications; grid infrastructure; provenance information; scientific workflows; Access control; Astronomy; Clustering algorithms; Data security; Databases; Distributed algorithms; Distributed computing; Information security; Production; Scalability; distributed data; provenance; query algorithm;
fLanguage
English
Publisher
IEEE
Conference_Titel
eScience, 2008. eScience '08. IEEE Fourth International Conference on
Conference_Location
Indianapolis, IN
Print_ISBN
978-1-4244-3380-3
Electronic_ISBN
978-0-7695-3535-7
Type
conf
DOI
10.1109/eScience.2008.38
Filename
4736754