Title :
A Study on the Method of Adaptive Time Intervals Checkpointing
Author :
Qianqian Wu ; Bin Li ; Shuaijun Chen ; Zhenzhou Ji
Author_Institution :
Dept. of Comput. Sci. & Eng., Harbin Inst. of Technol. at Weihai, Weihai, China
Abstract :
Setting up checkpoints is an effective and important means of fault recovery in fault-tolerant computer systems, but checkpoint overhead has a great influence on system performance. This paper designs and implements basic checkpoint information preservation and rollback recovery functions. When the system load becomes heavier, continuing to use fixed time intervals checkpointing imposes too much checkpoint overhead on the target process. To address this problem, a method of adaptive time intervals checkpointing is proposed. As the system load increases, the method adaptively adjusts the time intervals according to the current system load and reduces the number of checkpoints appropriately. This reduces the extra time overhead caused by setting up checkpoints and improves the performance of the fault-tolerant system. Experimental results show that the proposed method reduces the time overhead compared with fixed time intervals checkpointing.
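The idea of lengthening the checkpoint interval as load rises can be illustrated with a minimal sketch. The abstract does not give the paper's adjustment rule, so everything below is an assumption made for illustration: the names `adaptive_interval` and `checkpoint_times`, the parameters `load_threshold` and `max_interval`, and the proportional scaling rule are hypothetical and are not the authors' method.

```python
def adaptive_interval(base_interval, load, load_threshold=1.0, max_interval=None):
    """Return a checkpoint interval (seconds) for the given system load.

    Hypothetical rule (not from the paper): at or below load_threshold the
    base interval is used; above it the interval grows in proportion to
    load / load_threshold, optionally capped at max_interval.
    """
    if base_interval <= 0 or load_threshold <= 0:
        raise ValueError("base_interval and load_threshold must be positive")
    if load <= load_threshold:
        interval = base_interval
    else:
        interval = base_interval * (load / load_threshold)
    if max_interval is not None:
        interval = min(interval, max_interval)
    return interval


def checkpoint_times(total_time, load_at, base_interval, adaptive=True,
                     load_threshold=1.0, max_interval=None):
    """List the times at which checkpoints would be taken during a run.

    load_at is a caller-supplied function mapping elapsed time to load.
    With adaptive=False the interval is always base_interval.
    """
    times = []
    t = 0.0
    while True:
        if adaptive:
            interval = adaptive_interval(base_interval, load_at(t),
                                         load_threshold, max_interval)
        else:
            interval = base_interval
        t += interval
        if t > total_time:
            break
        times.append(t)
    return times


if __name__ == "__main__":
    def heavy(t):
        # Assumed constant load of twice the threshold, for illustration only.
        return 2.0

    fixed = checkpoint_times(100, heavy, 10, adaptive=False)
    adapted = checkpoint_times(100, heavy, 10, adaptive=True)
    print(len(fixed), len(adapted))
```

Under this assumed rule, a 100-second run with a 10-second base interval and a constant load of twice the threshold takes 10 checkpoints with fixed intervals and 5 with adaptive intervals. On a Unix system the load value could, for example, be taken from the first element of `os.getloadavg()`, though which load metric the paper uses is not stated in the abstract.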
Keywords :
checkpointing; fault tolerant computing; adaptive time intervals checkpointing; checkpoint information preservation; fault recovery; fault-tolerant computer system; rollback recovery function; system performance; Adaptive systems; Benchmark testing; Checkpointing; Context; Fault tolerance; Linux; Registers; Adaptive; Checkpoint Intervals; System Load; Time Overhead;
Conference_Title :
Instrumentation and Measurement, Computer, Communication and Control (IMCCC), 2014 Fourth International Conference on
Conference_Location :
Harbin
Print_ISBN :
978-1-4799-6574-8
DOI :
10.1109/IMCCC.2014.96