Title :
A new way of calculating the recovery line through eliminating useless checkpoints in distributed systems
Author :
Pourmahmoud, Solmaz ; Asbaghi, Shabnam ; Haghighat, Abolfazl T.
Author_Institution :
Technol. & Eng. Dept., Islamic Azad Univ. Khoy Branch, Khoy
Abstract :
Uncoordinated checkpointing is a simple protocol used in many distributed systems for fault tolerance. In this paper, we discuss the amount of rollback it incurs in the presence of failures. To determine the recovery line in checkpoint-based recovery, we first study two common approaches, the dependency graph and the checkpoint graph, and provide algorithms for both. We then introduce a new approach for calculating the recovery line that builds a graph we call the independent graph. Finally, we present a solution for reducing the cost of the graph when calculating the recovery line, particularly when the domino effect occurs.
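As a rough illustration of what calculating a recovery line involves, the sketch below uses the conventional rollback-propagation idea: start from each process's latest checkpoint and keep rolling receivers back until no orphan message remains. It is not the independent-graph method of the paper, whose details the abstract does not give. The function name `recovery_line`, the message tuple layout, and the convention that interval k is the execution following checkpoint k are all assumptions made for this example.

```python
from typing import Dict, List, Tuple

# Hypothetical message record: (sender, send_interval, receiver, recv_interval).
# Assumed convention: interval k of a process is the execution that follows
# its checkpoint k, so restoring checkpoint k undoes everything in intervals >= k.
Message = Tuple[int, int, int, int]


def recovery_line(latest: Dict[int, int], messages: List[Message]) -> Dict[int, int]:
    """Return one checkpoint index per process such that no message is an orphan."""
    line = dict(latest)  # start from every process's most recent checkpoint
    changed = True
    while changed:
        changed = False
        for sender, send_iv, receiver, recv_iv in messages:
            send_undone = send_iv >= line[sender]   # the send is rolled back
            recv_kept = recv_iv < line[receiver]    # the receive is still recorded
            if send_undone and recv_kept:
                # Orphan message: roll the receiver back to the checkpoint
                # that precedes the receive event.
                line[receiver] = recv_iv
                changed = True
    return line


# Two processes whose messages zigzag between checkpoints (illustrative data).
domino_messages = [
    (0, 2, 1, 1),
    (1, 1, 0, 1),
    (0, 1, 1, 0),
    (1, 0, 0, 0),
]
domino_line = recovery_line({0: 2, 1: 2}, domino_messages)
quiet_line = recovery_line({0: 2, 1: 2}, [])
```

In this made-up trace every rollback exposes another orphan message, so both processes fall back to their initial checkpoints: the domino effect. With no messages at all, the recovery line is simply the latest checkpoint of each process.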
Keywords :
checkpointing; distributed processing; fault tolerant computing; graph theory; checkpoint graph; dependency graph; distributed systems; fault tolerance; recovery line; uncoordinated checkpointing protocol; useless checkpoint elimination; Access protocols; Checkpointing; Communication channels; Costs; Distributed computing; Fault tolerant systems; Hardware; Message passing; Recovery line; domino effect; uncoordinated checkpointing; z-cycle; z-path;
Conference_Title :
Computer and Information Sciences, 2008. ISCIS '08. 23rd International Symposium on
Conference_Location :
Istanbul
Print_ISBN :
978-1-4244-2880-9
Electronic_ISBN :
978-1-4244-2881-6
DOI :
10.1109/ISCIS.2008.4717867