Title :
Distributed fault tolerance: lessons from Delta-4
Author_Institution :
Lab. d'Autom. et d'Anal. des Syst., CNRS, Toulouse, France
Abstract :
Because they avoid extensive redesign of specialized hardware, software-implemented approaches to fault tolerance are very resilient to change. Europe's Delta-4 project argues persuasively for implementing fault tolerance in a distributed fashion. The Delta-4 approach achieves fault tolerance by replicating capsules (runtime representations of application objects) on distributed, LAN-interconnected nodes. It can configure capsule groups to tolerate either stopping or arbitrary failures. Its multipoint protocols serve to coordinate capsule groups and to support error processing and fault treatment.
Keywords :
LAN interconnection; fault tolerant computing; local area networks; protocols; redundancy; Delta-4 project; LAN-interconnected nodes; application objects; arbitrary failures; capsule groups; capsule replication; distributed fault tolerance; error processing; fault treatment; multipoint protocols; runtime representations; software-implemented approaches; stopping failures; Application software; Communication standards; Computer crashes; Distributed computing; Fault tolerance; Fault tolerant systems; Hardware; Local area networks; Open systems; Redundancy;
Journal_Title :
Micro, IEEE