• DocumentCode
    1451481
  • Title
    A survey of recoverable distributed shared virtual memory systems
  • Author
    Morin, Christine; Puaut, Isabelle
  • Author_Institution
    Campus Univ. de Beaulieu, INRIA, Rennes, France
  • Volume
    8
  • Issue
    9
  • fYear
    1997
  • fDate
    9/1/1997
  • Firstpage
    959
  • Lastpage
    969
  • Abstract
    Distributed Shared Virtual Memory (DSVM) systems provide a shared memory abstraction on distributed memory architectures. Such systems ease parallel application programming because the shared-memory programming model is often more natural than the message-passing paradigm. However, the probability of failure of a DSVM increases with the number of sites. Thus, fault tolerance mechanisms must be implemented in order to allow processes to continue their execution in the event of a failure. This paper gives an overview of recoverable DSVMs (RDSVMs) that provide a checkpointing mechanism to restart parallel computations in the event of a site failure.
  • Keywords
    distributed memory systems; fault tolerant computing; memory architecture; shared memory systems; system recovery; virtual storage; checkpointing mechanism; distributed memory architectures; fault tolerance mechanisms; message-passing paradigm; parallel computations; recoverable distributed shared virtual memory systems; shared memory abstraction; shared-memory programming model; site failure; Bit error rate; Checkpointing; Concurrent computing; Fault tolerance; Hardware; Memory architecture; Parallel programming; Programming profession; Read-write memory; Writing;
  • fLanguage
    English
  • Journal_Title
    Parallel and Distributed Systems, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1045-9219
  • Type
    jour
  • DOI
    10.1109/71.615441
  • Filename
    615441