Title :
An Efficient and Reliable Scientific Workflow System
Author :
Tavares, Thiago ; Teodoro, George ; Kurc, Tahsin ; Ferreira, Ricardo ; Guedes, Dorgival ; Meira, Wagner ; Catalyurek, Umit
Author_Institution :
Department of Computer Science, Universidade Federal de Minas Gerais, Belo Horizonte
Abstract :
This paper presents a fault tolerance framework for applications that process data through a distributed network of user-defined operations in a pipelined fashion. The framework saves intermediate results and the messages exchanged among application components in a distributed data management system, which facilitates quick recovery from failures. Experimental results show that the framework scales well and introduces very little overhead to application execution.
Keywords :
middleware; natural sciences computing; software fault tolerance; system recovery; workflow management software; distributed data management system; distributed network; fault tolerance; scientific workflow system; user-defined operations; Biomedical computing; Biomedical informatics; Computer networks; Data analysis; Data processing; Distributed computing; Fault tolerant systems; Protocols
Conference_Title :
Seventh IEEE International Symposium on Cluster Computing and the Grid (CCGRID 2007)
Conference_Location :
Rio de Janeiro
Print_ISBN :
0-7695-2833-3
DOI :
10.1109/CCGRID.2007.20