DocumentCode :
1451481
Title :
A survey of recoverable distributed shared virtual memory systems
Author :
Morin, Christine ; Puaut, Isabelle
Author_Institution :
Campus Univ. de Beaulieu, INRIA, Rennes, France
Volume :
8
Issue :
9
fYear :
1997
fDate :
9/1/1997
Firstpage :
959
Lastpage :
969
Abstract :
Distributed Shared Virtual Memory (DSVM) systems provide a shared memory abstraction on distributed memory architectures. Such systems ease parallel application programming because the shared-memory programming model is often more natural than the message-passing paradigm. However, the probability of failure of a DSVM increases with the number of sites. Thus, fault tolerance mechanisms must be implemented in order to allow processes to continue their execution in the event of a failure. This paper gives an overview of recoverable DSVMs (RDSVMs) that provide a checkpointing mechanism to restart parallel computations in the event of a site failure.
Keywords :
distributed memory systems; fault tolerant computing; memory architecture; shared memory systems; system recovery; virtual storage; checkpointing mechanism; distributed memory architectures; fault tolerance mechanisms; message-passing paradigm; parallel computations; recoverable distributed shared virtual memory systems; shared memory abstraction; shared-memory programming model; site failure; Bit error rate; Checkpointing; Concurrent computing; Fault tolerance; Hardware; Memory architecture; Parallel programming; Programming profession; Read-write memory; Writing;
fLanguage :
English
Journal_Title :
Parallel and Distributed Systems, IEEE Transactions on
Publisher :
IEEE
ISSN :
1045-9219
Type :
jour
DOI :
10.1109/71.615441
Filename :
615441