DocumentCode :
2027431
Title :
The complexity of adding failsafe fault-tolerance
Author :
Kulkarni, Sandeep S. ; Ebnenasir, Ali
Author_Institution :
Dept. of Comput. Sci. & Eng., Michigan State Univ., East Lansing, MI, USA
fYear :
2002
fDate :
2002
Firstpage :
337
Lastpage :
344
Abstract :
In this paper, we focus on automating the addition of failsafe fault-tolerance to an existing (fault-intolerant) program. A failsafe fault-tolerant program satisfies its full specification (both safety and liveness) in the absence of faults and satisfies its safety specification in the presence of faults. We present a somewhat unexpected result: in general, the problem of adding failsafe fault-tolerance to distributed programs is NP-hard. We show this by reducing the 3-SAT problem to the problem of adding failsafe fault-tolerance. We also identify a class of specifications, monotonic specifications, and a class of programs, monotonic programs. Given a (positive) monotonic specification and a (negative) monotonic program, we show that failsafe fault-tolerance can be added in polynomial time. We note that the monotonicity restrictions are met by commonly encountered problems such as Byzantine agreement, distributed consensus, and atomic commitment. Finally, we argue that the restrictions on both the specifications and the programs are necessary for adding failsafe fault-tolerance in polynomial time: we prove that if only one of these conditions is satisfied, the addition of failsafe fault-tolerance remains NP-hard.
Keywords :
computational complexity; distributed programming; formal specification; software fault tolerance; 3-SAT problem; Byzantine agreement; NP-hard problem; atomic commitment; automation; complexity; distributed consensus; distributed programs; failsafe fault tolerance addition; monotonic programs; monotonic specifications; polynomial time; safety specification; Algorithm design and analysis; Automation; Computer science; Contracts; Engineering profession; Fault diagnosis; Fault tolerance; Fault tolerant systems; Polynomials; Safety;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Distributed Computing Systems, 2002. Proceedings. 22nd International Conference on
ISSN :
1063-6927
Print_ISBN :
0-7695-1585-1
Type :
conf
DOI :
10.1109/ICDCS.2002.1022271
Filename :
1022271