• DocumentCode
    2193702
  • Title

    User Friendly Management of Workflow Results: From Provenance Information to Grid Logical File Names

  • Author

    Glatard, Tristan ; Olabarriaga, Sílvia D.

  • fYear
    2008
  • fDate
    7-12 Dec. 2008
  • Firstpage
    103
  • Lastpage
    110
  • Abstract
    Grid workflows can produce thousands of results that should be properly organised to enable further analysis. Typically results are stored in locations hard-coded in the workflow or in the components, limiting reusability. In this paper we present an approach to (re)organise the output files generated by a grid workflow in a distributed storage environment. We propose to perform a post-mortem mapping of workflow results into a directory structure. This mapping is based on data provenance information and exploits grid catalog features, namely logical file names, to avoid data replication. By defining different mappings, users can generate their own semantic view of results generated during a workflow execution, which fosters user-friendliness while preserving workflow reusability. An implementation on the Virtual Resource Browser (VBrowser) framework is detailed and evaluated on neuroimaging workflows. Results show that the complex directory structure of an image analysis application can be properly generated by our system. An initial performance evaluation of the mapping resolution and directory structure creation indicates that this approach provides a practical, simple, yet powerful solution to an important roadblock for the adoption of workflows to implement complex image analysis pipelines.
  • Keywords
    file organisation; grid computing; medical image processing; replicated databases; software reusability; Virtual Resource Browser framework; data provenance information; data replication; distributed storage environment; grid catalog features; grid logical file names; grid workflows; image analysis; neuroimaging workflows; performance evaluation; workflow management; workflow reusability; Application software; Bioinformatics; Biomedical imaging; Biomedical informatics; Conference management; Image analysis; Information analysis; Mesh generation; Neuroimaging; Storage automation; Workflow; grid; large result sets; logical file catalog; provenance;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    eScience, 2008. eScience '08. IEEE Fourth International Conference on
  • Conference_Location
    Indianapolis, IN
  • Print_ISBN
    978-1-4244-3380-3
  • Electronic_ISBN
    978-0-7695-3535-7
  • Type

    conf

  • DOI
    10.1109/eScience.2008.31
  • Filename
    4736746