Title :
An extended home-based coherence protocol for causally consistent replicated read-write objects
Author :
Brzezinski, Jerzy ; Szychowiak, Michal
Author_Institution :
Inst. of Comput. Sci., Poznan Univ. of Technol., Poland
Abstract :
This paper considers the reliability of software Distributed Shared Memory systems where the unit of sharing is a persistent read-write object. We present an extended coherence protocol for the causal consistency model, which integrates replication management with independent checkpointing. It uses a novel coordinated burst checkpoint operation in order to replicate consistent checkpoints of shared objects in the local memory of distinct system nodes. No special reliable hardware devices are required. The protocol offers high availability of shared objects with limited overhead and ensures fast recovery in case of multiple node failures. In case of network partitioning, all the processes in a majority partition of the system can continuously access all the objects.
Keywords :
distributed shared memory systems; protocols; system recovery; causal consistency model; coordinated burst checkpoint operation; extended home-based coherence protocol; network partitioning; persistent read-write object; replication management; software distributed shared memory systems; Access protocols; Availability; Checkpointing; Clustering algorithms; Coherence; Fault tolerant systems; Hardware; Memory management; Object oriented modeling; Read-write memory;
Conference_Title :
Cluster Computing and the Grid, 2003. Proceedings. CCGrid 2003. 3rd IEEE/ACM International Symposium on
Print_ISBN :
0-7695-1919-9
DOI :
10.1109/CCGRID.2003.1199408