• DocumentCode
    3744809
  • Title
    System-level versus user-defined checkpointing
  • Author
    L.M. Silva; J.G. Silva
  • Author_Institution
    Dept. Engenharia Inf., Coimbra Univ., Portugal
  • fYear
    1998
  • Firstpage
    68
  • Lastpage
    74
  • Abstract
    Checkpointing and rollback recovery is a very effective technique for tolerating transient faults and preventive shutdowns. In the past, most of the checkpointing schemes published in the literature were designed to be transparent to the application programmer and were implemented at the operating-system level. In recent years, there has been some work on higher-level forms of checkpointing. In this second approach, the user is responsible for checkpoint placement and is required to specify the checkpoint contents. We compare the two approaches: system-level and user-defined checkpointing. We discuss the pros and cons of both approaches and present an experimental study that was conducted on a commercial parallel machine.
  • Keywords
    "Checkpointing","Programming profession","Operating systems","Program processors","Fault tolerance","Runtime library","Fault tolerant systems","Communication channels"
  • Publisher
    IEEE
  • Conference_Titel
    Reliable Distributed Systems, 1998. Proceedings. Seventeenth IEEE Symposium on
  • ISSN
    1060-9857
  • Print_ISBN
    0-8186-9218-9
  • Type
    conf
  • DOI
    10.1109/RELDIS.1998.740476
  • Filename
    740476