Title :
Rigorous development of an embedded fault-tolerant system based on coordinated atomic actions
Author :
Xu, Jie ; Randell, Brian ; Romanovsky, Alexander ; Stroud, Robert J. ; Zorzo, Avelino F. ; Canver, Ercument ; Von Henke, Friedrich
Author_Institution :
Durham Univ., UK
fDate :
2/1/2002
Abstract :
Describes our experience using coordinated atomic (CA) actions as a system structuring tool to design and validate a sophisticated embedded control system for a complex industrial application that has high reliability and safety requirements. Our study is based on an extended production cell model, the specification and simulator for which were defined and developed by FZI (Forschungszentrum Informatik, Germany). This "fault-tolerant production cell" represents a manufacturing process involving redundant mechanical devices (provided to enable continued production in the presence of machine faults). The challenge posed by the model specification is to design a control system that maintains specified safety and liveness properties even in the presence of a large number and variety of device and sensor failures. Based on an analysis of such failures, we provide details of: (1) a design for a control program that uses CA actions to deal with both safety-related and fault tolerance concerns and (2) the formal verification of this design based on the use of model checking. We found that CA action structuring facilitated both the design and verification tasks by enabling the various safety problems (involving possible clashes of moving machinery) to be treated independently. Even complex situations involving the concurrent occurrence of any pair of the many possible mechanical and sensor failures can be handled simply yet appropriately. The formal verification activity was performed in parallel with the design activity, and the interaction between them resulted in a combined exercise in "design for validation"; formal verification was very valuable in identifying some very subtle residual bugs in early versions of our design that would otherwise have been difficult to detect.
Keywords :
embedded systems; fault tolerant computing; flexible manufacturing systems; formal verification; multiprocessing systems; redundancy; safety; complex industrial application; concurrency; continued production; control program design; coordinated atomic actions; design for validation; device failures; embedded control system; embedded fault-tolerant system development; exception handling; fault-tolerant production cell; formal verification; liveness properties; machine faults; manufacturing process; model checking; model specification; moving machinery clashes; object orientation; redundant mechanical devices; reliability requirements; safety requirements; sensor failures; simulator; subtle residual bugs; system structuring tool; Control systems; Electrical equipment industry; Fault tolerance; Fault tolerant systems; Formal verification; Industrial control; Manufacturing industries; Mechanical sensors; Production; Safety
Journal_Title :
Computers, IEEE Transactions on