Title :
Diskless Checkpointing with Rollback-Dependency Trackability
Author :
Menderico, Raphael Marcos ; Garcia, Islene Calciolari
Author_Institution :
Inst. of Comput., State Univ. of Campinas (UNICAMP), Campinas, Brazil
Date :
Oct. 31 2010-Nov. 3 2010
Abstract :
One way to implement fault-tolerant applications is to store the application's current state in stable memory and, when a failure occurs, restart the application from the last consistent global state. If the number of simultaneous failures is expected to be small, a diskless checkpointing approach can be used, in which a failed process's state can be recovered by accessing only the memory of non-faulty processes. In the literature, diskless checkpointing is usually based on synchronous protocols or on properties of the application. In this paper we present a quasi-synchronous diskless checkpointing algorithm, called RDT-Diskless, based on Rollback-Dependency Trackability. The proposed algorithm includes a garbage collection approach that limits the number of checkpoints that must be kept in memory. A framework, called Cheops, was developed, and experimental results were obtained from a commercial cloud environment.
Keywords :
checkpointing; fault tolerant computing; protocols; storage management; Cheops; RDT-diskless; fault tolerant applications; garbage collection approach; nonfaulty process memory; quasi-synchronous diskless checkpointing algorithm; rollback-dependency trackability; synchronous protocols; Checkpointing; Clouds; Fault tolerance; Fault tolerant systems; Protocols; Servers; Synchronization; availability; checkpointing; dependability; distributed algorithms; fault-tolerance
Conference_Title :
Reliable Distributed Systems, 2010 29th IEEE Symposium on
Conference_Location :
New Delhi
Print_ISBN :
978-0-7695-4250-8
DOI :
10.1109/SRDS.2010.17