DocumentCode :
3333144
Title :
Design and evaluation of fault tolerance techniques for highly parallel architectures
Author :
Abraham, Jacob A.
Author_Institution :
Comput. Eng. Res. Center, Texas Univ., Austin, TX, USA
fYear :
1991
fDate :
1-2 Mar 1991
Firstpage :
12
Abstract :
Summary form only given. The author discusses fault tolerance techniques for computer systems, including a new technique, which he calls algorithm-based fault tolerance, for error detection and correction when computations are performed using multiple processor systems. The technique uses knowledge about the algorithm to reduce the amount of overhead necessary for fault tolerance. This is done by appropriately encoding the data and tailoring the algorithms to operate on the encoded data and produce encoded output data. Examples are given of applications including matrix operations, fast Fourier transforms, and computation of eigenvalues.
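The abstract gives no implementation details, so the following is only an illustrative sketch of the general idea for one of the listed applications, matrix multiplication. It assumes the commonly described checksum encoding: a row of column sums is appended to the first operand and a column of row sums to the second, so that the product carries both kinds of checksum. All function names here are hypothetical, and integer data is assumed so that checksum comparisons are exact; floating-point data would need a tolerance.

```python
def column_checksum(a):
    """Return a copy of matrix a with an extra row holding each column's sum."""
    return [row[:] for row in a] + [[sum(col) for col in zip(*a)]]


def row_checksum(b):
    """Return a copy of matrix b with each row's sum appended to that row."""
    return [row + [sum(row)] for row in b]


def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def locate_and_correct(c):
    """Check a full-checksum matrix c and repair one faulty data element.

    Returns the (row, column) index of the corrected element, or None if
    every checksum is consistent. Raises ValueError for any other error
    pattern (for example, a fault in a checksum element itself).
    """
    n, m = len(c) - 1, len(c[0]) - 1  # size of the data part
    bad_rows = [i for i in range(n) if sum(c[i][:m]) != c[i][m]]
    bad_cols = [j for j in range(m)
                if sum(c[i][j] for i in range(n)) != c[n][j]]
    if not bad_rows and not bad_cols:
        return None
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        i, j = bad_rows[0], bad_cols[0]
        # Rebuild the element from the row checksum and the other row entries.
        c[i][j] = c[i][m] - (sum(c[i][:m]) - c[i][j])
        return (i, j)
    raise ValueError("error pattern is not a single data-element fault")


# Encoded operands produce an encoded (full-checksum) product.
a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
c = matmul(column_checksum(a), row_checksum(b))

c[1][0] = 99                      # simulate a fault in one processor's result
fixed_at = locate_and_correct(c)  # inconsistent row and column intersect here
```

In this sketch the extra work is one additional row and column of the product, which is the sense in which knowledge of the algorithm keeps the overhead below that of duplicating the whole computation.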
Keywords :
error correction; error detection; fault tolerant computing; parallel architectures; FFT; algorithm-based fault tolerance; computer systems; eigenvalues; encoded data; fast Fourier transforms; highly parallel architectures; matrix operations; multiple processor systems; Application software; Circuit faults; Concurrent computing; Encoding; Fabrication; Fault tolerance; Fault tolerant systems; Integrated circuit technology; Jacobian matrices; Parallel architectures
fLanguage :
English
Publisher :
IEEE
Conference_Title :
VLSI, 1991. Proceedings., First Great Lakes Symposium on
Conference_Location :
Kalamazoo, MI
Print_ISBN :
0-8186-2170-2
Type :
conf
DOI :
10.1109/GLSV.1991.143934
Filename :
143934