Title :
Flexible, Cost-Effective Membership Agreement in Synchronous Systems
Author :
Barbosa, Raul ; Karlsson, Johan
Author_Institution :
Dept. of Comput. Sci. & Eng., Chalmers Univ. of Technol., Goteborg
Abstract :
This paper presents a processor group membership protocol for fault-tolerant distributed real-time systems that utilize periodic, time-triggered scheduling for sending messages over the system's communication network. The protocol allows fault-free nodes to reach agreement on the operational state of all nodes in the presence of fail-silent or fail-reporting node failures as well as network failures (lost or corrupted messages). The protocol is based on the principle that each message sent by a node in the membership is acknowledged by k other nodes in a system of n nodes, where k can be set to any number between 2 and n - 1. Agreement on node failure (membership departure) and agreement on node recovery (membership reintegration) are handled by two different mechanisms. Agreement on departure is guaranteed if no more than f = k - 1 failures occur in the same communication round, while at most one node can be reintegrated into the membership per communication round.
Keywords :
computer network reliability; fault tolerant computing; message passing; protocols; real-time systems; scheduling; communication network; fail-reporting node failure; fail-silent node failure; fault-tolerant distributed real-time system; flexible cost-effective membership agreement; network failure; periodic time-triggered scheduling; processor group membership protocol; synchronous system; Access protocols; Bandwidth; Communication networks; Computer science; Context; Costs; Fault tolerance; Fault tolerant systems; Processor scheduling; Real time systems
Conference_Title :
Dependable Computing, 2006. PRDC '06. 12th Pacific Rim International Symposium on
Conference_Location :
Riverside, CA
Print_ISBN :
0-7695-2724-8
DOI :
10.1109/PRDC.2006.36