DocumentCode :
1941872
Title :
Units of computation in fault-tolerant distributed systems
Author :
Ahuja, Mohan ; Mishra, Shivakant
Author_Institution :
Dept. of Comput. Sci. & Eng., California Univ., San Diego, La Jolla, CA, USA
fYear :
1994
fDate :
21-24 Jun 1994
Firstpage :
626
Lastpage :
633
Abstract :
We develop a framework that aids in understanding, and thereby in designing, fault-tolerant distributed systems. We define a unit of computation in such systems, referred to as a molecule, that has a well-defined interface with other molecules, i.e., has minimal dependence on other molecules. The smallest such unit, an indivisible molecule, is termed an atom. We show that any execution of a fault-tolerant distributed computation can be seen as an execution of molecules/atoms in a partial order. Such a view provides insight into the computation, particularly for a fault-tolerant system, where it is important to guarantee that a unit of computation is executed either completely or not at all, and where system designers need to reason about the states reached after such units execute. We prove several properties satisfied by molecules and atoms, and present algorithms to detect atoms in an ongoing computation and to force the completion of a molecule. We illustrate uses of this work in application areas such as debugging, checkpointing, and reasoning about stable properties.
Keywords :
distributed algorithms; distributed processing; fault tolerant computing; program debugging; reliability; atom; checkpointing; debugging; fault-tolerant distributed systems; indivisible molecule; molecule; ongoing computation; partial order; reasoning; stable properties; units of computation; Checkpointing; Computer interfaces; Computer science; Debugging; Design engineering; Distributed computing; Fault tolerant systems; Modems; Sun;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Distributed Computing Systems, 1994., Proceedings of the 14th International Conference on
Conference_Location :
Poznan
Print_ISBN :
0-8186-5840-1
Type :
conf
DOI :
10.1109/ICDCS.1994.302480
Filename :
302480