• DocumentCode
    2853496
  • Title

    Scientific workflow rewriting while preserving provenance

  • Author

    Cohen-Boulakia, Sarah ; Froidevaux, C. ; JiuQiang Chen

  • Author_Institution
    AMIB Group, Univ. Paris Sud, Paris, France
  • fYear
    2012
  • fDate
    8-12 Oct. 2012
  • Firstpage
    1
  • Lastpage
    9
  • Abstract
    Scientific workflow systems are numerous and equipped with provenance modules that collect the data produced and consumed during workflow runs to enhance reproducibility. An increasing number of approaches have been developed to help manage provenance information. Some of them can process data in polynomial time, but they require workflows to have series-parallel (SP) structures. Rewriting any workflow into an SP workflow is thus particularly important. In this paper, (i) we introduce the concept of a provenance-equivalent rewriting process, (ii) we review existing graph transformations, (iii) we design the provenance-equivalent SPFlow algorithm, and (iv) we evaluate our approach over a thousand real workflows.
  • Keywords
    bioinformatics; computational complexity; data handling; graph theory; rewriting systems; SP workflow; bioinformatics experiments; data collection; graph transformations; polynomial time; provenance information management; provenance modules; provenance preservation; provenance-equivalent SPFlow algorithm; provenance-equivalent rewriting process; reproducibility enhancement; scientific workflow rewriting; scientific workflow systems; series-parallel structures; Bioinformatics; Complexity theory; Context; Data models; Educational institutions; History; Polynomials;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    E-Science (e-Science), 2012 IEEE 8th International Conference on
  • Conference_Location
    Chicago, IL
  • Print_ISBN
    978-1-4673-4467-8
  • Type

    conf

  • DOI
    10.1109/eScience.2012.6404419
  • Filename
    6404419