DocumentCode :
1378493
Title :
Hardware-supported asynchronous checkpointing scheme
Author :
Chiu, J.-F. ; Chiu, G.-M.
Author_Institution :
Dept. of Electr. Eng. & Technol., Nat. Taiwan Univ. of Sci. & Technol., Taipei, Taiwan
Volume :
145
Issue :
2
fYear :
1998
fDate :
3/1/1998 12:00:00 AM
Firstpage :
109
Lastpage :
115
Abstract :
The authors propose a hardware-supported scheme to facilitate fast checkpointing and failure recovery operations. The mechanism uses a small bank of nonvolatile memory to save an incremental checkpoint for a processor so that the time overhead incurred by checkpointing can be reduced. A parity technique is employed to compress checkpointing information. An important feature of the scheme is that the checkpointing operation is dissociated from the parity update action. As a result, checkpointing latency is not affected by the speed of parity update activities, and thus is reduced. Moreover, the scheme does not require an atomic action for updating the parity data. Furthermore, it allows each processor to initiate a checkpoint independently of the others. Experimental results show that the overhead of the mechanism is small and is not sensitive to the number of checkpoints taken by the processors. This observation suggests that the proposed hardware-supported scheme is promising for improving the performance of checkpoint/rollback-recovery systems.
Keywords :
fault tolerant computing; multiprocessing systems; parallel architectures; system recovery; asynchronous checkpointing; checkpointing; failure recovery; fault tolerance; multicomputer system; nonvolatile memory; rollback-recovery;
fLanguage :
English
Journal_Title :
Computers and Digital Techniques, IEE Proceedings -
Publisher :
IET
ISSN :
1350-2387
Type :
jour
DOI :
10.1049/ip-cdt:19981908
Filename :
674989