DocumentCode
3295617
Title
State checksum and its role in system stabilization
Author
Huang, Chin-Tser ; Gouda, Mohamed G.
Author_Institution
Dept. of Comput. Sci. & Eng., South Carolina Univ., Columbia, SC, USA
fYear
2005
fDate
6-10 June 2005
Firstpage
29
Lastpage
34
Abstract
Although a self-stabilizing system that suffers from a transient fault is guaranteed to converge to a legitimate state after a finite number of steps, the convergence can be slow if the harmful effects of the fault are allowed to propagate into many processes in the system. Moreover, some safety properties of the system may be violated during the convergence. To address these problems, we propose in this paper the concept of a state checksum - a redundancy that can be added to the state of a self-stabilizing system so that some classes of faults become visible to the system, and the system can limit the propagation of their harmful effects, and maintain its safety properties during the convergence. To make these concepts concrete, we discuss the case study of a token ring and show how to use fault-detecting and fault-correcting checksums to detect visible faults, limit the propagation of their harmful effects, and ensure that the safety properties of the ring are maintained during the convergence from these faults.
Keywords
fault diagnosis; fault tolerant computing; redundancy; safety systems; system recovery; fault convergence; fault detection; fault-correcting checksum; fault-detecting checksum; self-stabilizing system; system safety; system stabilization; token ring; Computer science; Concrete; Conferences; Convergence; Distributed computing; Fault detection; Interference; Redundancy; Safety; Token networks
fLanguage
English
Publisher
IEEE
Conference_Titel
Distributed Computing Systems Workshops, 2005. 25th IEEE International Conference on
Print_ISBN
0-7695-2328-5
Type
conf
DOI
10.1109/ICDCSW.2005.128
Filename
1437153