DocumentCode :
892780
Title :
A Software Technique for Diagnosing and Correcting Memory Errors
Author :
Liss, J.
Author_Institution :
IBM, Kingston
Volume :
35
Issue :
1
fYear :
1986
fDate :
4/1/1986
Firstpage :
12
Lastpage :
18
Abstract :
A software diagnostic that eliminates 2-bit and some 3-bit errors is described. The diagnostic procedure tests memory for errors that cannot be corrected by the ECC (error-correcting code), which provides single-error correction and double-error detection. When an uncorrectable error is found, the diagnostic attempts to reduce it to a 1-bit error. This is done either by reconfiguring the memory to distribute failing bits across different ECC words or by replacing the failing chip with a spare. As a result, memory cards that previously had to be replaced can continue to function, which prolongs their life. The diagnostic can also perform preventive maintenance when run in an alternate mode. In this mode, all combinations of the memory are tested to determine whether there is reserve. Reserve is defined as: 1) the capability of reconfiguring the card to obtain another functional state of memory (in addition to the current operational state), or 2) the availability of functional spare chips that have not been used. Preventive maintenance consists of replacing cards that have no reserve, so that memory operation can continue error-free.
Keywords :
Circuits; Costs; Error correction; Error correction codes; Failure analysis; Hardware; Preventive maintenance; Reliability engineering; Software design; Testing;
fLanguage :
English
Journal_Title :
Reliability, IEEE Transactions on
Publisher :
IEEE
ISSN :
0018-9529
Type :
jour
DOI :
10.1109/TR.1986.4335331
Filename :
4335331