Title :
An improved scheme of index-based checkpointing
Author :
Luo, Yuan-sheng ; Min, Yinghua ; Zhang, Dafang
Author_Institution :
Hunan Univ., China
Abstract :
To provide efficient rollback-recovery for fault tolerance in distributed systems, it is important to reduce the number of checkpoints in index-based distributed checkpointing algorithms while still guaranteeing that consistent global checkpoints exist. This paper presents a new index-based checkpointing scheme, IBQSC. It reduces the number of forced checkpoints when multiple processes transfer data almost equally frequently. It also keeps processes that have fewer opportunities to transfer data synchronized, so that they avoid excessive rollback-recovery overhead caused by losing useful computation when a failure occurs. Simulation results show that the proposed IBQSC scheme can reduce the number of induced forced checkpoints per message by 25-30% on average compared with traditional strategies.
Keywords :
checkpointing; distributed algorithms; fault tolerant computing; indexing; fault-tolerance; index-based distributed checkpointing algorithms; rollback-recovery; Algorithm design and analysis; Checkpointing; Clocks; Computational modeling; Computers; Distributed computing; Fault tolerant systems; Force control; Protocols; Synchronization; Active-synchronous; Checkpoint; Distributed systems; Domino-effect; Index;
Conference_Title :
Dependable Computing, 2005. Proceedings. 11th Pacific Rim International Symposium on
Print_ISBN :
0-7695-2492-3
DOI :
10.1109/PRDC.2005.17