Title :
Fault-tolerant clock synchronization in large multicomputer systems
Author :
Olson, Alan ; Shin, Kang G.
Author_Institution :
Real-Time Comput. Lab., Michigan Univ., Ann Arbor, MI, USA
fDate :
9/1/1994
Abstract :
The cost of synchronizing a multicomputer increases with system size. For large multicomputers, the time and resources spent to enable each node to estimate the clock value of every other node in the system can be prohibitive. We show how to reduce the cost of synchronization by assigning each node to one or more groups, then having each node estimate the clock values of only those nodes with which it shares a group. Since each node estimates the clock value of only a subset of the nodes, the cost of synchronization can be significantly reduced. We also provide methods for computing the maximum skew between any two nodes in the multicomputer and the maximum time between synchronizations, and we show how the fault tolerance of the synchronization algorithm may be determined.
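The grouping idea in the abstract can be illustrated with a minimal sketch that counts how many clock estimates are needed. The group layout used here (rows and columns of a square grid) and every function name are assumptions made for illustration only; the record does not say how the paper assigns nodes to groups.

```python
def peers_by_group(groups):
    """Map each node to the set of other nodes it shares at least one group with."""
    peers = {}
    for group in groups:
        for node in group:
            peers.setdefault(node, set()).update(n for n in group if n != node)
    return peers


def estimate_count(peers):
    """Total number of (node, peer) clock estimates made across the system."""
    return sum(len(p) for p in peers.values())


def grid_groups(side):
    """Hypothetical layout: every row and every column of a side x side grid is a group."""
    rows = [[(r, c) for c in range(side)] for r in range(side)]
    cols = [[(r, c) for r in range(side)] for c in range(side)]
    return rows + cols


side = 4
n = side * side
grouped = estimate_count(peers_by_group(grid_groups(side)))
full = n * (n - 1)  # every node estimates every other node's clock
print(grouped, full)  # prints "96 240"
```

In this toy layout each node belongs to two groups and estimates 6 clocks instead of 15. The sketch only counts estimates; it says nothing about the resulting skew bound or fault tolerance, which the paper analyzes separately.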
Keywords :
clocks; fault tolerant computing; multiprocessing systems; reliability; synchronisation; clock drift; clock skew; clock value; fault tolerance; fault-tolerant clock synchronization; large multicomputer systems; maximum skew; maximum time; synchronization algorithm; Atomic clocks; Control systems; Costs; Current measurement; Energy consumption; Fault tolerance; Fault tolerant systems; Hardware; Synchronization; Time measurement
Journal_Title :
Parallel and Distributed Systems, IEEE Transactions on