Title :
Resilient workflows for cooperative design
Author :
Nguyên, Toàn ; Trifan, Laurentiu ; Désidéri, Jean-Antoine
Author_Institution :
INRIA, St. Ismier, France
Abstract :
This paper describes an approach to extend process modeling for engineering design applications with fault-tolerance and resilience capabilities. It is based on the need for application-level error handling, a requirement for petascale and exascale scientific computing. This complements the traditional fault-tolerance management features provided by existing hardware and distributed systems, which are often based on duplication and migration of data and operations, and on checkpoint-restart procedures. We show how they can be optimized for high-performance infrastructures. The approach is applied to a prototype tested against industrial test cases for the optimization of engineering design artifacts.
Keywords :
checkpointing; design engineering; natural sciences computing; software fault tolerance; application level error handling; checkpoint restart procedures; cooperative design; electronic document; engineering design applications; engineering design artifacts; exascale scientific computing; fault tolerance management features; high performance infrastructures; petascale scientific computing; process modeling; resilient workflows; Computational modeling; Fault tolerance; Fault tolerant systems; Hardware; Optimization; Resilience; Software; Workflows; distributed systems; engineering design; fault-tolerance; high-performance computing; resilience;
Conference_Title :
Computer Supported Cooperative Work in Design (CSCWD), 2011 15th International Conference on
Conference_Location :
Lausanne
Print_ISBN :
978-1-4577-0386-7
DOI :
10.1109/CSCWD.2011.5960057