Title :
Agreeing on who is present and who is absent in a synchronous distributed system
Author_Institution :
IBM Almaden Res. Center, San Jose, CA, USA
Abstract :
The author describes his system model and failure assumptions and precisely specifies the processor group membership problem. He then gives two protocols for solving this problem. The protocols provide all correct processors with consistent views of the processor group membership. They also guarantee bounded processor failure detection and join processing delays despite any number of performance failures that do not cause network partitioning. The first protocol provides very fast processor failure detection but can require significant message traffic overhead, even when no failures occur. To reduce this overhead, the author derives the second protocol, which has provably minimal message overhead in the absence of failures but has a longer failure detection delay and is more complex. He concludes by comparing his approach with other known approaches.
Keywords :
distributed processing; protocols; bounded processor failure detection; membership problem; message traffic overhead; synchronous distributed system; system model; Clocks; Computer networks; Delay; Fault tolerant systems; Hardware; Network servers; Operating systems; Protocols; Telecommunication traffic; Traffic control;
Conference_Title :
Fault-Tolerant Computing, 1988. FTCS-18, Digest of Papers., Eighteenth International Symposium on
Conference_Location :
Tokyo, Japan
Print_ISBN :
0-8186-0867-6
DOI :
10.1109/FTCS.1988.5321