• DocumentCode
    796351
  • Title

    Virtual checkpoints: architecture and performance

  • Author

    Bowen, Nicholas S. ; Pradhan, Dhiraj K.

  • Author_Institution
    IBM Thomas J. Watson Res. Center, Yorktown Heights, NY, USA
  • Volume
    41
  • Issue
    5
  • fYear
    1992
  • fDate
    5/1/1992 12:00:00 AM
  • Firstpage
    516
  • Lastpage
    525
  • Abstract
    Checkpoint and rollback recovery is a technique that allows a system to tolerate a failure by periodically saving the entire state and, if an error is detected, rolling back to the prior checkpoint. A technique that embeds the support for checkpoint and rollback recovery directly into the virtual memory translation hardware is presented. The scheme is general enough to be implemented on various scopes of data such as a portion of an address space, a single address space, or multiple address spaces. The technique can provide a high-performance scheme for implementing checkpoint and rollback recovery. The performance of the scheme is analyzed using a trace-driven simulation. The overhead is a function of the interval between checkpoints and becomes very small for intervals greater than 10^6 references. However, the scheme is shown to be feasible for intervals as small as 1000 references under certain conditions.
  • Keywords
    fault tolerant computing; performance evaluation; address space; failure tolerance; performance analysis; rollback recovery; trace-driven simulation; virtual checkpoints; virtual memory translation hardware; Analytical models; Application software; Computational modeling; Computer architecture; Counting circuits; Fault detection; Fault tolerance; Hardware; Performance analysis; Very large scale integration
  • fLanguage
    English
  • Journal_Title
    Computers, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0018-9340
  • Type

    jour

  • DOI
    10.1109/12.142677
  • Filename
    142677