• DocumentCode
    1853421
  • Title
    Accelerated heartbeat protocols
  • Author
    Gouda, Mohamed G. ; McGuire, Tommy M.
  • Author_Institution
    Dept. of Comput. Sci., Texas Univ., Austin, TX, USA
  • fYear
    1998
  • fDate
    26-29 May 1998
  • Firstpage
    202
  • Lastpage
    209
  • Abstract
    Heartbeat protocols are used by distributed programs to ensure that if a process in a program terminates or fails, then the remaining processes in the program terminate. We present a class of heartbeat protocols that tolerate message loss. In these protocols, a root process periodically sends a beat message to every other process and then waits to receive a reply beat message from every other process. If the root process does not receive a reply (possibly due to message loss), it halves the period for sending beat messages. We show that in practical situations, the parameters of these protocols can be chosen to achieve a good compromise between three contradictory objectives: reduce the rate of sending beat messages, reduce the detection delay, and still keep the probability of premature termination small.
  • Keywords
    computer network reliability; message passing; protocols; software fault tolerance; beat message; detection delay; distributed programs; fault tolerance; heartbeat protocols; message loss; process termination; program termination; Acceleration; Computer networks; Delay; Detection algorithms; Fault detection; Heart beat; Heart rate detection; Protocols; Read only memory
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Distributed Computing Systems, 1998. Proceedings. 18th International Conference on
  • Conference_Location
    Amsterdam
  • ISSN
    1063-6927
  • Print_ISBN
    0-8186-8292-2
  • Type
    conf
  • DOI
    10.1109/ICDCS.1998.679503
  • Filename
    679503
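
The period-halving rule described in the abstract can be illustrated with a small sketch. This is not the authors' protocol specification: the names `t_max` and `t_min`, the reset to `t_max` after a complete round of replies, and the rule that the root gives up once the halved period would drop below `t_min` are all assumptions made for illustration. Only the halving on a missing reply comes from the abstract.

```python
def next_period(period, got_all_replies, t_max, t_min):
    """Return the root's next waiting period, or None if it should give up.

    t_max and t_min are hypothetical parameters, not taken from the record.
    """
    if got_all_replies:
        # Assumption: a complete round of replies resets the period.
        return t_max
    # From the abstract: a missing reply halves the sending period.
    halved = period / 2
    if halved < t_min:
        # Assumption: below t_min the root stops waiting and terminates.
        return None
    return halved


def run_root(reply_outcomes, t_max=8.0, t_min=1.0):
    """Replay a list of per-round outcomes and return the periods used.

    reply_outcomes[i] is True when every reply arrived in round i.
    """
    period = t_max
    periods = []
    for got_all_replies in reply_outcomes:
        periods.append(period)
        period = next_period(period, got_all_replies, t_max, t_min)
        if period is None:
            break
    return periods


if __name__ == "__main__":
    # Two missed rounds accelerate the beat; a full round of replies resets it.
    print(run_root([True, False, False, True, False]))
    # With no replies at all, the root gives up after a few accelerated rounds.
    print(run_root([False] * 10))
```

Under these assumed rules, a silent peer is detected after a bounded number of accelerated rounds, while an isolated lost message only costs a temporary increase in the beat rate, which is the trade-off the abstract describes.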