• DocumentCode
    775119
  • Title
    Checkpoint space reclamation for uncoordinated checkpointing in message-passing systems

  • Author
    Wang, Yi-Min ; Chung, Pi-Yu ; Lin, In-Jen ; Fuchs, W. Kent

  • Author_Institution
    Coordinated Sci. Lab., Illinois Univ., Urbana, IL, USA
  • Volume
    6
  • Issue
    5
  • fYear
    1995
  • fDate
    5/1/1995
  • Firstpage
    546
  • Lastpage
    554
  • Abstract
    Uncoordinated checkpointing allows process autonomy and general nondeterministic execution, but suffers from potential domino effects and the associated space overhead. Prior to this research, checkpoint space reclamation had been based on the notion of obsolete checkpoints; as a result, a potentially unbounded number of nonobsolete checkpoints may have to be retained on stable storage. In this paper, we derive a necessary and sufficient condition for identifying all garbage checkpoints. By using the approach of recovery line transformation and decomposition, we develop an optimal checkpoint space reclamation algorithm and show that the space overhead for uncoordinated checkpointing is in fact bounded by N(N+1)/2 checkpoints, where N is the number of processes.
  • Keywords
    fault tolerant computing; message passing; storage management; checkpoint space reclamation; fault tolerance; garbage checkpoints; garbage collection; message-passing systems; rollback recovery; space reclamation; stable storage; uncoordinated checkpointing; Checkpointing; Hardware; Laboratories; Mathematics; Military computing; NASA; Protocols; Runtime; Software systems; Sufficient conditions
  • fLanguage
    English
  • Journal_Title
    Parallel and Distributed Systems, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1045-9219
  • Type
    jour
  • DOI
    10.1109/71.382324
  • Filename
    382324