DocumentCode :
3155548
Title :
The PTC scheme for designing loosely coupled recoverable processes: issues in realizing bounded recovery time
Author :
Kim, K.H.
Author_Institution :
Dept. of Electr. & Comput. Eng., California Univ., Irvine, CA, USA
fYear :
1992
fDate :
14-16 Apr 1992
Firstpage :
287
Lastpage :
296
Abstract :
The technology for designing loosely coupled distributed computer systems (DCSs) required to tolerate propagated errors caused by software and/or hardware has remained in an immature state. This paper focuses on the type of DCS applications where a system is structured as a set of loosely coupled interacting processes distributed among multiple physical sites and each process is designed in the 'partitioned design' mode, i.e., designed with its interface specification only, rather than with full knowledge of interfaces between other processes (or sites). The thesis is that fault tolerance capabilities must be designed into loosely coupled processes without violating the design policy. The programmer-transparent coordination (PTC) scheme is one such approach that has been evolving since 1978. While the basic PTC scheme, called the PTC/OR (PTC with obedient receiver) scheme, facilitates various forms of cooperative backward recovery in systems of loosely coupled processes, it has one drawback: the difficulty of bounding worst-case recovery time. After discussing various possible solution approaches and their limitations, a promising approach called the PTC/SL (PTC with session leaders) scheme, which superimposes additional rules on structuring process interactions onto those of the PTC/OR scheme, is presented. Under the PTC/SL scheme, various flexible forms of process interactions are still allowed while the task of ensuring bounded recovery time is made a simple one. Several research issues related to the PTC/SL scheme, e.g., efficient implementation techniques, remain as subjects for future research.
Keywords :
distributed processing; fault tolerant computing; system recovery; PTC/SL; bounded recovery time; cooperative backward recovery; fault tolerance; hardware; interface specification; loosely coupled recoverable processes; programmer-transparent coordination; propagated errors; session leaders; software; worst-case recovery time; Application software; Computer errors; Design engineering; Distributed control; Fault detection; Fault tolerance; Fault tolerant systems; Hardware; Process design; Wide area networks;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Proceedings of the Third Workshop on Future Trends of Distributed Computing Systems, 1992
Conference_Location :
Taipei
Print_ISBN :
0-8186-2755-7
Type :
conf
DOI :
10.1109/FTDCS.1992.217482
Filename :
217482