DocumentCode
3744809
Title
System-level versus user-defined checkpointing
Author
L.M. Silva;J.G. Silva
Author_Institution
Dept. Engenharia Inf., Coimbra Univ., Portugal
fYear
1998
Firstpage
68
Lastpage
74
Abstract
Checkpointing and rollback recovery is a very effective technique to tolerate transient faults and preventive shutdowns. In the past, most of the checkpointing schemes published in the literature were supposed to be transparent to the application programmer and implemented at the operating-system level. In recent years, there has been some work on higher-level forms of checkpointing. In this second approach, the user is responsible for the checkpoint placement and is required to specify the checkpoint contents. We compare the two approaches: system-level and user-defined checkpointing. We discuss the pros and cons of both approaches and we present an experimental study that was conducted on a commercial parallel machine.
Keywords
"Checkpointing","Programming profession","Operating systems","Program processors","Fault tolerance","Runtime library","Fault tolerant systems","Communication channels"
Publisher
ieee
Conference_Titel
Reliable Distributed Systems, 1998. Proceedings. Seventeenth IEEE Symposium on
ISSN
1060-9857
Print_ISBN
0-8186-9218-9
Type
conf
DOI
10.1109/RELDIS.1998.740476
Filename
740476