DocumentCode :
2737900
Title :
Portable checkpointing and recovery
Author :
Silva, Luis M. ; Silva, João G. ; Chapple, Simon ; Clarke, Lyndon
Author_Institution :
Dept. de Engenharia Inf., Coimbra Univ., Portugal
fYear :
1995
fDate :
2-4 Aug 1995
Firstpage :
188
Lastpage :
195
Abstract :
This paper presents a checkpointing scheme that was implemented in a parallel library that runs on top of CHIMP/MPI. The main goals of the checkpointing mechanism are portability and efficiency. It runs on every platform supported by MPI in a machine-independent way. The scheme allows the migration of checkpoints and offers a flexible recovery mechanism based on data-reconfiguration. Some performance results will be presented at the end of the paper together with some techniques that can be used to increase the efficiency of the checkpointing mechanism.
Keywords :
operating systems (computers); parallel machines; software portability; system recovery; data-reconfiguration; CHIMP/MPI; flexible recovery mechanism; parallel library; portability; portable checkpointing; recovery; Checkpointing; Computer crashes; Distributed computing; Guidelines; Libraries; Operating systems; Parallel machines; Parallel processing; Proposals; Workstations
fLanguage :
English
Publisher :
IEEE
Conference_Title :
High Performance Distributed Computing, 1995., Proceedings of the Fourth IEEE International Symposium on
Conference_Location :
Washington, DC
ISSN :
1082-8907
Print_ISBN :
0-8186-7088-6
Type :
conf
DOI :
10.1109/HPDC.1995.518709
Filename :
518709