Title :
Implementation and Performance Evaluation of an Adaptable Failure Detector for Distributed System
Author :
Zhou, Jingli ; Yang, Guang ; Dong, Lijun ; Liu, Gang
Abstract :
Unreliable failure detectors are an important abstraction for building dependable distributed applications over asynchronous distributed systems subject to faults. Their implementations are commonly based on timeouts to ensure algorithm termination. However, for systems built on the Internet, the timeout value is hard to estimate because of traffic variations. To improve performance, self-tuned failure detectors dynamically adapt their timeouts to the observed communication delay, plus a safety margin. In this paper, we propose a new implementation of a failure detector. It is a variant of the heartbeat failure detector that is adaptable and can support scalable applications. The implementation separates two aspects: a basic estimation of the expected arrival date, which provides a short detection time, and an adaptation of the quality of service. The latter rests on two principles: an adaptation layer and a heuristic that adapts the sending period of "I am alive" messages.
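The timeout idea in the abstract (an expected arrival date for the next heartbeat, extended by a safety margin) can be illustrated with a minimal sketch. This is not the authors' implementation: the class name, the sliding-window mean used as the estimator, the constant margin, and all parameter names are assumptions made for illustration. The adaptation layer and the sending-period heuristic are not modelled.

```python
from collections import deque


class AdaptiveHeartbeatDetector:
    """Hypothetical sketch: windowed arrival estimate plus a fixed safety margin."""

    def __init__(self, period, margin, window=100):
        self.period = period  # assumed sending period of "I am alive" messages (seconds)
        self.margin = margin  # assumed constant safety margin (seconds)
        # Each sample is the arrival time minus the nominal send time (seq * period).
        self.offsets = deque(maxlen=window)
        self.last_seq = None

    def heartbeat(self, seq, arrival):
        """Record heartbeat number `seq`, received at local time `arrival`."""
        self.offsets.append(arrival - seq * self.period)
        if self.last_seq is None or seq > self.last_seq:
            self.last_seq = seq

    def expected_arrival(self):
        """Estimated arrival date of the next heartbeat, or None before any sample."""
        if not self.offsets:
            return None
        mean_offset = sum(self.offsets) / len(self.offsets)
        return mean_offset + (self.last_seq + 1) * self.period

    def suspects(self, now):
        """True once the next heartbeat is overdue by more than the safety margin."""
        expected = self.expected_arrival()
        return expected is not None and now > expected + self.margin


# Illustrative usage with made-up arrival times.
fd = AdaptiveHeartbeatDetector(period=1.0, margin=0.5)
for seq, arrival in [(0, 0.25), (1, 1.25), (2, 2.25)]:
    fd.heartbeat(seq, arrival)
print(fd.expected_arrival(), fd.suspects(3.5), fd.suspects(4.0))
```

A larger margin lowers the chance of wrongly suspecting a slow but live process at the cost of a longer detection time, which is the trade-off a quality-of-service adaptation would tune.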
Keywords :
Application software; Computational intelligence; Computer crashes; Computer security; Delay; Detectors; Event detection; Fault detection; Quality of service; Time measurement
Conference_Title :
Computational Intelligence and Security, 2007 International Conference on
Conference_Location :
Harbin
Print_ISBN :
0-7695-3072-9
Electronic_ISBN :
978-0-7695-3072-7
DOI :
10.1109/CIS.2007.61