Title :
New encoding/decoding methods for designing fault-tolerant matrix operations
Author :
Tao, D.L. ; Hartmann, C.R.P. ; Han, Yunghsing S.
Author_Institution :
Dept. of Electr. Eng., State Univ. of New York, Stony Brook, NY, USA
fDate :
9/1/1996
Abstract :
Algorithm-based fault tolerance (ABFT) can provide low-cost error protection for array processors and multiprocessor systems. Several ABFT techniques (weighted checksum schemes) have been proposed for designing fault-tolerant matrix operations. In these schemes, encoding and decoding require multiplications or divisions, so the overhead is high. This paper proposes new encoding/decoding methods for designing fault-tolerant matrix operations; their unique feature is that encoding and decoding use only additions and subtractions. New algorithms are also proposed for constructing error-detecting/correcting codes with minimum Hamming distance 3 and 4. We show that these new coding schemes drastically reduce the overhead introduced by incorporating fault tolerance.
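The abstract does not give the paper's code constructions, so the following is only a minimal sketch of the general ABFT idea it builds on: the classic unweighted row/column checksum scheme for matrix multiplication, not the authors' new codes. It is included because its checksum encoding and its single-error correction also use only additions and subtractions. All function names are hypothetical, and integer matrices are assumed so that checksum comparisons are exact.

```python
def column_checksum(a):
    """Encode A: append a row holding each column's sum (additions only)."""
    return a + [[sum(col) for col in zip(*a)]]


def row_checksum(b):
    """Encode B: append to each row the sum of that row (additions only)."""
    return [row + [sum(row)] for row in b]


def matmul(a, b):
    """Plain matrix product; this is the operation being protected."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def check_and_correct(c):
    """Check a full-checksum product and repair one bad data element.

    c has n data rows plus a checksum row, and m data columns plus a
    checksum column. Returns "ok", "corrected", or "detected".
    """
    n, m = len(c) - 1, len(c[0]) - 1
    bad_rows = [i for i in range(n) if sum(c[i][:m]) != c[i][m]]
    bad_cols = [j for j in range(m)
                if sum(c[i][j] for i in range(n)) != c[n][j]]
    if not bad_rows and not bad_cols:
        return "ok"
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        i, j = bad_rows[0], bad_cols[0]
        # Decode with one subtraction and one addition:
        # the row checksum minus the current row sum is the error magnitude.
        c[i][j] += c[i][m] - sum(c[i][:m])
        return "corrected"
    # Any other mismatch pattern is reported but not repaired in this sketch.
    return "detected"


def protected_product(a, b):
    """Multiply the encoded operands; the result carries both checksums."""
    return matmul(column_checksum(a), row_checksum(b))
```

As a usage illustration, `protected_product([[1, 2], [3, 4]], [[5, 6], [7, 8]])` yields a 3x3 matrix whose last row and last column are checksums of the 2x2 product. If a single data element is then corrupted, `check_and_correct` locates it at the intersection of the failing row and column and restores it. Weighted schemes of the kind the abstract criticizes would instead multiply elements by weights during encoding and divide during decoding.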
Keywords :
error correction codes; error detection codes; fault tolerant computing; matrix algebra; parallel processing; algorithm-based fault tolerance; array processors; error detecting/correcting codes; error protection; fault-tolerant matrix operations; multiprocessor systems; Decoding; Design methodology; Encoding; Error correction codes; Fault detection; Fault tolerance; Fault tolerant systems; Matrix decomposition; Multiprocessing systems; Protection
Journal_Title :
Parallel and Distributed Systems, IEEE Transactions on