Title :
Discontinuous incremental: A new approach towards extremely lightweight checkpoints
Author :
Ha, Viet Hai ; Renault, Éric
Author_Institution :
Inst. Telecom, Telecom SudParis, Évry, France
Abstract :
Checkpointing is an important method for providing fault tolerance, load balancing, process migration, periodic backup, and many other functions. It is also the basic tool used in CAPE, a paradigm that aims to distribute the execution of a program across a distributed-memory environment. This paper presents a new approach to checkpointing and an original optimization of the checkpoint structure, both of which we have implemented and evaluated, to make incremental checkpointing more efficient and better suited to CAPE in particular.
Keywords :
checkpointing; distributed memory systems; optimization; resource allocation; software fault tolerance; CAPE; distributed memory environment; extremely lightweight checkpoints; fault tolerance; load balancing; periodic backup; process migration; Kernel; Markov processes; Memory management; Monitoring; Writing
Conference_Title :
Computer Networks and Distributed Systems (CNDS), 2011 International Symposium on
Conference_Location :
Tehran
Print_ISBN :
978-1-4244-9153-7
DOI :
10.1109/CNDS.2011.5764578