Title :
Simulating fail-stop in asynchronous distributed systems
Author :
Sabel, Laura S. ; Marzullo, Keith
Author_Institution :
Dept. of Comput. Sci., Cornell Univ., Ithaca, NY, USA
Abstract :
The fail-stop failure model appears frequently in the distributed systems literature. However, in an asynchronous distributed system, the fail-stop model cannot be implemented. In particular, it is impossible to reliably detect crash failures in an asynchronous system. In this paper, we show that it is possible to specify and implement a failure model that is indistinguishable from the fail-stop model from the point of view of any process within an asynchronous system. We give necessary conditions for a failure model to be indistinguishable from the fail-stop model, and derive lower bounds on the amount of process replication needed to implement such a failure model. We present a simple one-round protocol for implementing one such failure model, which we call simulated fail-stop.
Keywords :
distributed processing; fault tolerant computing; system recovery; asynchronous distributed systems; crash failures; fail-stop failure model; process replication; Computational modeling; Computer crashes; Computer science; Context modeling; Distributed algorithms; Lead; NASA; Nominations and elections; Protocols; Scholarships
Conference_Title :
Reliable Distributed Systems, 1994. Proceedings., 13th Symposium on
Conference_Location :
Dana Point, CA
Print_ISBN :
0-8186-6575-0
DOI :
10.1109/RELDIS.1994.336901