Title :
Enhanced distributed recovery blocks: a unified approach for the design of safety-critical distributed systems
Author :
Elphick, J.R. ; Patton, R.J. ; Tyrrell, A.M.
Author_Institution :
Dept. of Electron., York Univ., Heslington, UK
fDate :
10/21/1993 12:00:00 AM
Abstract :
A novel method is given for dealing with both hardware and software faults in a distributed system, and the authors illustrate how the method copes with communication failures between the interacting distributed processors. The work has been designed using the Occam programming language and implemented on a network of transputers. This work is being extended to more complex control applications and shows very good results. The mechanism used is based on distributed recovery blocks (K.H. Kim, H.O. Welch, 1989). It is argued that distributed recovery blocks (DRB) are well suited for real-time control applications. DRB are based on the standard method of recovery blocks. The enhancements incorporated within DRB include the concurrent execution of the try blocks over a distributed network of processing nodes and the dynamic reconfiguration of nodal operations in the event of a fault. The proposed system takes the basic DRB and introduces extra acceptance tests to reduce the chances of Byzantine-type errors; it is termed an Enhanced DRB (EDRB).
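The core idea of the abstract, two try blocks executed concurrently with an acceptance test applied to each result, can be illustrated with a minimal Python sketch. This is not the authors' Occam and transputer implementation. The names `run_try_block`, `drb_step`, `faulty_sqrt`, `newton_sqrt` and `sqrt_ok` are hypothetical, threads stand in for the distributed processing nodes, and the sketch omits dynamic reconfiguration, communication failures and the extra acceptance tests of the EDRB, whose details the abstract does not give.

```python
from concurrent.futures import ThreadPoolExecutor


def run_try_block(try_block, acceptance_test, value):
    """Run one try block and return (accepted, result)."""
    try:
        result = try_block(value)
    except Exception:
        # A crash in the try block counts as a failed acceptance test.
        return False, None
    return bool(acceptance_test(value, result)), result


def drb_step(primary, alternate, acceptance_test, value):
    """One cycle: both try blocks run concurrently; the primary result is preferred."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        primary_future = pool.submit(run_try_block, primary, acceptance_test, value)
        shadow_future = pool.submit(run_try_block, alternate, acceptance_test, value)
        primary_ok, primary_result = primary_future.result()
        shadow_ok, shadow_result = shadow_future.result()
    if primary_ok:
        return "primary", primary_result
    if shadow_ok:
        # The primary failed its test, so the shadow's result is used instead.
        return "shadow", shadow_result
    raise RuntimeError("both try blocks failed the acceptance test")


def faulty_sqrt(x):
    """Hypothetical primary try block with a seeded fault for large inputs."""
    return -1.0 if x > 100 else x ** 0.5


def newton_sqrt(x):
    """Hypothetical alternate try block: Newton iteration for x > 0."""
    guess = x
    for _ in range(50):
        guess = 0.5 * (guess + x / guess)
    return guess


def sqrt_ok(x, result):
    """Acceptance test: result is non-negative and squares back to x."""
    return result >= 0 and abs(result * result - x) <= 1e-6 * max(1.0, x)


if __name__ == "__main__":
    print(drb_step(faulty_sqrt, newton_sqrt, sqrt_ok, 16.0))
    print(drb_step(faulty_sqrt, newton_sqrt, sqrt_ok, 144.0))
```

Because the alternate runs at the same time as the primary rather than after it fails, an accepted shadow result is available without a rollback and retry, which is the property the abstract argues suits real-time control.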
Keywords :
computerised control; distributed processing; fault tolerant computing; parallel programming; safety; transputer systems; Byzantine type errors; Enhanced DRB; Occam programming language; acceptance tests; communication failures; complex control applications; concurrent execution; distributed network; distributed recovery blocks; distributed system; dynamic reconfiguration; interacting distributed processors; nodal operations; processing nodes; real-time control applications; safety-critical distributed systems; software faults; standard method; transputers; try blocks;
Conference_Titel :
Safety Critical Distributed Systems, IEE Colloquium on
Conference_Location :
London