Regional Information Center for Science and Technology - Coordinated checkpointing without direct coordination

DocumentCode :

327106

Title :

Coordinated checkpointing without direct coordination

Author :

Neves, Nuno ; Fuchs, W. Kent

Author_Institution :

Coordinated Sci. Lab., Illinois Univ., Urbana, IL, USA

fYear :

1998

fDate :

7-9 Sep 1998

Firstpage :

Lastpage :

Abstract :

Coordinated checkpointing is a well-known method to achieve fault tolerance in distributed systems. Long-running parallel applications and high-availability applications are two potential users of checkpointing, although with different requirements. Parallel applications need low failure-free overheads, and high-availability applications require fast and bounded recoveries. In this paper, we describe a new coordinated checkpoint protocol capable of satisfying both types of applications. The protocol uses time to avoid all types of direct coordination (e.g., message exchanges and message tagging), reducing the overheads to almost a minimum. To ensure that rapid recoveries can be attained, the protocol guarantees small checkpoint latencies. The protocol was implemented and tested on a cluster of workstations connected by a 155 Mbit/sec ATM. Experimental results show that the protocol overheads are very small.

Keywords :

distributed processing; fault tolerant computing; protocols; checkpoint latencies; coordinated checkpoint protocol; coordinated checkpointing; distributed systems; fault tolerance; Application software; Availability; Checkpointing; Contracts; Delay; Electrical capacitance tomography; Identity-based encryption; Protocols; Tagging; Testing

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Computer Performance and Dependability Symposium, 1998. IPDS '98. Proceedings. IEEE International

Conference_Location :

Durham, NC

ISSN :

1087-2191

Print_ISBN :

0-8186-8679-0

Type :

conf

DOI :

10.1109/IPDS.1998.707706

Filename :

707706

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=327106