Title :
Service-Oriented Reliable Problem Solving Environment for Scientific Computation
Author :
Liu, Cancan ; Zhang, Weimin ; Luo, Zhigang ; Liu, Hai ; Xiao, Lin
Author_Institution :
Sch. of Comput. Sci., Nat. Univ. of Defense Technol., Changsha
Abstract :
Scientific computations on a dynamic and unsteady grid architecture are large in scale and long-running, so the fault tolerance of scientific workflow management systems is increasingly important. To handle the inevitable failures of workflow activities, this paper presents a three-level recovery strategy. At the service level, a distributed Service Agent (SA) is provided for each activity to monitor its execution status and to implement a retry-based recovery strategy by resubmitting the failed activity multiple times. At the workflow level, the workflow engine implements a replication-based strategy by requesting the Service Factory (SF) to create another service instance on a different node and invoking the new instance as a replacement. At the user level, a user interface lets users handle failures on demand. Finally, a reliable Problem Solving Environment (PSE) for the climate domain, called Ensemble Prediction Scientific Workflow (EPSWFlow), is presented. This approach can seamlessly embed complex, control-flow-intensive recovery strategies within the dataflow process network. Moreover, it can make the prediction process more robust and more reusable.
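The three recovery levels described in the abstract can be illustrated with a minimal Python sketch. This is not the EPSWFlow implementation: the names `ServiceAgent`, `ServiceFactory`, `run_activity`, `ActivityFailure`, `max_attempts`, and `ask_user` are all hypothetical, and the sketch assumes that a service instance is a plain callable that raises an exception on failure.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


class ActivityFailure(Exception):
    """Hypothetical error raised when a workflow activity fails."""


@dataclass
class ServiceAgent:
    """Service level: resubmit a failed activity a bounded number of times."""
    max_attempts: int = 3  # assumed bound; the abstract only says "multiple times"

    def run(self, service: Callable[[], str]) -> str:
        last_error: Optional[Exception] = None
        for _ in range(self.max_attempts):
            try:
                return service()
            except ActivityFailure as err:
                last_error = err  # remember the failure and retry
        raise ActivityFailure(f"retries exhausted: {last_error}")


@dataclass
class ServiceFactory:
    """Creates a service instance for an activity on a given node."""
    make: Callable[[str], Callable[[], str]]

    def create(self, node: str) -> Callable[[], str]:
        return self.make(node)


def run_activity(nodes: List[str], factory: ServiceFactory,
                 agent: ServiceAgent, ask_user: Callable[[], str]) -> str:
    """Workflow level: replicate on the next node when retries fail.
    User level: hand the failure to the user when no node succeeds."""
    for node in nodes:
        try:
            return agent.run(factory.create(node))
        except ActivityFailure:
            continue  # replication: try a new instance on a different node
    return ask_user()
```

In this sketch the levels are nested so that each one is tried only after the cheaper one below it has been exhausted; how the actual system orders or combines the levels is not stated in the abstract.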
Keywords :
fault tolerant computing; workflow management software; Ensemble Prediction Scientific Workflow; Problem Solving Environment; distributed Service Agent; scientific workflow management system; service-oriented reliable problem solving environment; unsteady grid architecture; workflow engine; Computer architecture; Condition monitoring; Engines; Fault tolerant systems; Grid computing; Large-scale systems; Problem-solving; Production facilities; User interfaces
Conference_Title :
Asia-Pacific Services Computing Conference, 2008. APSCC '08. IEEE
Conference_Location :
Yilan
Print_ISBN :
978-0-7695-3473-2
Electronic_ISBN :
978-0-7695-3473-2
DOI :
10.1109/APSCC.2008.118