Title :
Fault tolerant distributed information systems
Author :
Knight, John C. ; Elder, Matthew C.
Author_Institution :
Dept. of Comput. Sci., Virginia Univ., Charlottesville, VA, USA
Abstract :
Critical infrastructures provide services upon which society depends heavily; these applications are themselves dependent on distributed information systems for all aspects of their operation and so survivability of the information systems is an important issue. Fault tolerance is a mechanism by which survivability can be achieved in these information systems. We outline a specification-based approach to fault tolerance, called RAPTOR, that enables structuring of fault-tolerance specifications and an implementation partially synthesized from the formal specification. The RAPTOR approach uses three specifications describing the fault-tolerant system, the errors to be detected, and the actions to take to recover from those errors. System specification utilizes an object-oriented database to store the descriptions associated with these large, complex systems. The error detection and recovery specifications are defined using the formal specification notation Z. We also describe an implementation architecture and explore our solution with a case study.
Keywords :
distributed processing; formal specification; information systems; software fault tolerance; RAPTOR; distributed information systems; fault tolerance; object-oriented database; specification-based; survivability; Application software; Computer architecture; Computer errors; Fault detection; Fault tolerant systems; Formal specifications; Monitoring; Redundancy
Conference_Title :
Software Reliability Engineering, 2001. ISSRE 2001. Proceedings. 12th International Symposium on
Print_ISBN :
0-7695-1306-9
DOI :
10.1109/ISSRE.2001.989466