DocumentCode :
1411318
Title :
Deterministic voting in distributed systems using error-correcting codes
Author :
Xu, Lihao ; Bruck, Jehoshua
Author_Institution :
Dept. of Electr. Eng., California Inst. of Technol., Pasadena, CA, USA
Volume :
9
Issue :
8
fYear :
1998
fDate :
8/1/1998 12:00:00 AM
Firstpage :
813
Lastpage :
824
Abstract :
Distributed voting is an important problem in reliable computing. In an N Modular Redundant (NMR) system, the N computational modules execute identical tasks and they need to periodically vote on their current states. In this paper, we propose a deterministic majority voting algorithm for NMR systems. Our voting algorithm uses error-correcting codes to drastically reduce the average case communication complexity. In particular, we show that the efficiency of our voting algorithm can be improved by choosing the parameters of the error-correcting code to match the probability of the computational faults. For example, consider an NMR system with 31 modules, each with a state of m bits, where each module has an independent computational error probability of 10-3. 1, this NMR system, our algorithm can reduce the average case communication complexity to approximately 1.0825 m compared with the communication complexity of 31 m of the naive algorithm in which every module broadcasts its local result to all other modules. We have also implemented the voting algorithm over a network of workstations. The experimental performance results match well the theoretical predictions
Keywords :
communication complexity; distributed processing; error correction codes; fault tolerant computing; MDS code; N Modular Redundant; NMR systems; communication complexity; deterministic majority voting; distributed systems; distributed voting; error-correcting code; error-correcting codes; majority voting; network of workstations; voting algorithm; Broadcasting; Checkpointing; Complexity theory; Distributed computing; Error correction codes; Error probability; Fault tolerant systems; Nuclear magnetic resonance; Voting; Workstations;
fLanguage :
English
Journal_Title :
Parallel and Distributed Systems, IEEE Transactions on
Publisher :
ieee
ISSN :
1045-9219
Type :
jour
DOI :
10.1109/71.706052
Filename :
706052
Link To Document :
بازگشت