DocumentCode :
2747969
Title :
The performance of coordinated and independent checkpointing
Author :
Silva, Luis Moura ; Silva, João Gabriel
Author_Institution :
Dept. de Engenharia Inf., Coimbra Univ., Portugal
fYear :
1999
fDate :
12-16 Apr 1999
Firstpage :
280
Lastpage :
284
Abstract :
Checkpointing is a very effective technique to tolerate the occurrence of failures in distributed and parallel applications. The existing algorithms in the literature are basically divided into two main classes: coordinated and independent checkpointing. This paper presents an experimental study that compares the performance of these two classes of algorithms. The main conclusion of our study is that coordinated checkpointing is more efficient than independent checkpointing, and that all the arguments against the performance of coordinated algorithms were not verified in practice.
Keywords :
fault tolerant computing; performance evaluation; system recovery; checkpointing; coordinated checkpointing; distributed; fault tolerance; independent checkpointing; parallel; Bandwidth; Checkpointing; Electrical capacitance tomography; Parallel machines; Protocols; Runtime library; Testing;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Proceedings 13th International Parallel Processing Symposium and 10th Symposium on Parallel and Distributed Processing (IPPS/SPDP 1999)
Conference_Location :
San Juan
Print_ISBN :
0-7695-0143-5
Type :
conf
DOI :
10.1109/IPPS.1999.760487
Filename :
760487