DocumentCode
1853421
Title
Accelerated heartbeat protocols
Author
Gouda, Mohamed G. ; McGuire, Tommy M.
Author_Institution
Dept. of Comput. Sci., Texas Univ., Austin, TX, USA
fYear
1998
fDate
26-29 May 1998
Firstpage
202
Lastpage
209
Abstract
Heartbeat protocols are used by distributed programs to ensure that if a process in a program terminates or fails, then the remaining processes in the program terminate. We present a class of heartbeat protocols that tolerate message loss. In these protocols, a root process periodically sends a beat message to every other process and then waits to receive a reply beat message from each of them. If the root process does not receive a reply (possibly due to message loss), it halves the period for sending beat messages. We show that in practical situations, the parameters of these protocols can be chosen to achieve a good compromise between three contradictory objectives: reducing the rate of sending beat messages, reducing the detection delay, and keeping the probability of premature termination small.
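The period-halving rule described in the abstract can be illustrated with a minimal Python sketch. The abstract states only that the root halves its sending period after a missing reply; the names `t_max` and `t_min`, the reset to `t_max` after a reply, and termination once the halved period falls below `t_min` are assumptions made for this illustration and are not taken from the record.

```python
from typing import List, Optional


def next_period(current: float, reply_received: bool,
                t_max: float, t_min: float) -> Optional[float]:
    """Return the root's next sending period, or None to signal termination.

    Assumptions (hypothetical, not from the record): a reply resets the
    period to t_max, and the root gives up once the halved period would
    fall below t_min.
    """
    if reply_received:
        return t_max
    halved = current / 2  # the halving step stated in the abstract
    return halved if halved >= t_min else None


def silent_run(t_max: float, t_min: float) -> List[float]:
    """List the periods the root waits through if no reply ever arrives."""
    periods = []
    period: Optional[float] = t_max
    while period is not None:
        periods.append(period)
        period = next_period(period, False, t_max, t_min)
    return periods
```

Under these assumptions, `sum(silent_run(t_max, t_min))` gives the time the root spends waiting before it terminates when the other process is silent, which is one way to see the trade-off between beat rate and detection delay that the abstract mentions.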
Keywords
computer network reliability; message passing; protocols; software fault tolerance; beat message; detection delay; distributed programs; fault tolerance; heartbeat protocols; message loss; process termination; program termination; Acceleration; Computer networks; Delay; Detection algorithms; Fault detection; Heart beat; Heart rate detection; Protocols; Read only memory
fLanguage
English
Publisher
IEEE
Conference_Titel
Distributed Computing Systems, 1998. Proceedings. 18th International Conference on
Conference_Location
Amsterdam
ISSN
1063-6927
Print_ISBN
0-8186-8292-2
Type
conf
DOI
10.1109/ICDCS.1998.679503
Filename
679503