DocumentCode :
3675977
Title :
A Quantitative Study on the Re-executability of Publicly Shared Scientific Workflows
Author :
Rudolf Mayer;Andreas Rauber
Author_Institution :
SBA Res., Vienna, Austria
fYear :
2015
Firstpage :
312
Lastpage :
321
Abstract :
Workflows have become a popular means of implementing experiments in the computational sciences. They have advantages over other forms of implementation: they require a formalisation of the experiment process, they provide a standard set of functions, and they abstract away the underlying system. They thus facilitate the understandability and repeatability of experimental research. In addition, metadata standards such as Research Objects, which allow more metadata about the research process to be recorded, are intended to enable better reproducibility of experiments. However, as several studies have shown, merely implementing an experiment as a workflow in a workflow engine is not sufficient to achieve these goals, because a number of challenges and pitfalls remain. In this paper, we quantify how many workflow executions are easy to repeat. To this end, we automatically obtain and analyse a set of almost 1,500 workflows available on the myExperiment platform, focusing on those authored in the Taverna workflow language. We provide statistics on the types of processing steps used, and investigate which vulnerabilities with regard to re-execution they face. We then try to execute the workflows automatically. From these results, we identify the most common causes of failure, and analyse how these can be countered with existing or yet-to-be-developed approaches.
Keywords :
"Program processors","Engines","Java","Web services","Ports (Computers)","Libraries"
Publisher :
IEEE
Conference_Titel :
e-Science (e-Science), 2015 IEEE 11th International Conference on
Type :
conf
DOI :
10.1109/eScience.2015.58
Filename :
7304314