DocumentCode :
1435416
Title :
Improving the Robustness of Distributed Failure Detectors in Adverse Conditions
Author :
Lemos, F.T.C. ; Sato, L.M.
Author_Institution :
Univ. de Sao Paulo (USP), Sao Paulo, Brazil
Volume :
10
Issue :
1
fYear :
2012
Firstpage :
1364
Lastpage :
1369
Abstract :
Failure detection is at the core of most fault tolerance strategies, but it often depends on reliable communication. We present new failure detector algorithms suited to serve as components of a fault tolerance system deployed under adverse network conditions (such as loosely connected and loosely administered computing grids). The algorithms pack redundancy into heartbeat messages, which makes them more robust than traditional protocols. Results from experimental tests conducted in a simulated environment with adverse network conditions show significant improvement over existing solutions.
Keywords :
protocols; telecommunication network reliability; adverse network conditions; distributed failure detectors; fault tolerance strategies; heartbeat messages; reliable communication; Biomedical monitoring; Detectors; Fault tolerance; Heart beat; Monitoring; Payloads; Robustness; Failure Detection
fLanguage :
English
Journal_Title :
Latin America Transactions, IEEE (Revista IEEE America Latina)
Publisher :
IEEE
ISSN :
1548-0992
Type :
jour
DOI :
10.1109/TLA.2012.6142485
Filename :
6142485