DocumentCode :
3122683
Title :
Differencing Provenance in Scientific Workflows
Author :
Bao, Zhuowei ; Cohen-Boulakia, Sarah ; Davidson, Susan B. ; Eyal, Anat ; Khanna, Sanjeev
Author_Institution :
Dept. of Comput. & Inf. Sci., Univ. of Pennsylvania, Philadelphia, PA
fYear :
2009
fDate :
March 29, 2009 - April 2, 2009
Firstpage :
808
Lastpage :
819
Abstract :
Scientific workflow management systems are increasingly providing the ability to manage and query the provenance of data products. However, the problem of differencing the provenance of two data products produced by executions of the same specification has not been adequately addressed. Although this problem is NP-hard for general workflow specifications, an analysis of real scientific (and business) workflows shows that their specifications can be captured as series-parallel graphs overlaid with well-nested forking and looping. For this natural restriction, we present efficient, polynomial-time algorithms for differencing executions of the same specification and thereby understanding the difference in the provenance of their data products. We then describe a prototype called PDiffView built around our differencing algorithm. Experimental results demonstrate the scalability of our approach using collected, real workflows and increasingly complex runs.
Keywords :
computational complexity; formal specification; graph theory; workflow management software; NP-hard problem; data products; general workflow specifications; polynomial-time algorithms; scientific workflow management systems; series-parallel graphs; Conference management; Data engineering; Engineering management; Information science; Polynomials; Proteins; Prototypes; Scalability; USA Councils; Workflow management software;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Data Engineering, 2009. ICDE '09. IEEE 25th International Conference on
Conference_Location :
Shanghai
ISSN :
1084-4627
Print_ISBN :
978-1-4244-3422-0
Electronic_ISBN :
1084-4627
Type :
conf
DOI :
10.1109/ICDE.2009.103
Filename :
4812456