Abstract:
A software diagnostic that eliminates 2-bit and some 3-bit errors is described. The diagnostic procedure tests memory for errors that cannot be corrected by the ECC (error correcting code), which provides single-error correction and double-error detection. When an uncorrectable error is found, the diagnostic attempts to reduce it to a 1-bit error, which the ECC can correct. This is done either by reconfiguring the memory to distribute the failing bits across different ECC words or by replacing the failing chip with a spare. As a result, memory cards that previously had to be replaced can continue to function, and their life is prolonged. The diagnostic can also perform preventive maintenance when run in an alternate mode. In this mode, all combinations of the memory configuration are tested to determine whether there is reserve. Reserve is defined as: 1) the capability of reconfiguring the card to obtain another functional state of memory (in addition to the current operational state), or 2) the availability of functional spare chips that have not been used. Preventive maintenance is performed by replacing cards that have no reserve. Memory operation can then continue error-free.
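The reconfiguration and reserve checks described above can be illustrated with a minimal Python sketch. It is not the diagnostic itself: the abstract does not give the card layout, so the sketch assumes a hypothetical card with four chips and two ECC words, where each configuration is a tuple mapping chip index to ECC word index. The names `CONFIGS`, `is_functional`, `functional_configs`, and `has_reserve` are invented for illustration. A configuration is treated as functional when no ECC word contains more than one failing chip, since the code corrects only single errors.

```python
from collections import Counter

# Hypothetical configurations: CONFIGS[name][chip] is the ECC word
# that the chip contributes a bit to under that configuration.
CONFIGS = {
    "A": (0, 0, 1, 1),
    "B": (0, 1, 0, 1),
    "C": (0, 1, 1, 0),
}


def max_errors_per_word(config, failing_chips):
    """Largest number of failing chips that share one ECC word."""
    counts = Counter(config[chip] for chip in failing_chips)
    return max(counts.values(), default=0)


def is_functional(config, failing_chips):
    """Single-error-correcting ECC tolerates at most one bad bit per word."""
    return max_errors_per_word(config, failing_chips) <= 1


def functional_configs(configs, failing_chips):
    """Names of all configurations that reduce every error to 1 bit."""
    return [name for name, cfg in configs.items()
            if is_functional(cfg, failing_chips)]


def has_reserve(configs, current, failing_chips, unused_good_spares=0):
    """Reserve: another functional configuration, or an unused good spare."""
    alternatives = [name for name in functional_configs(configs, failing_chips)
                    if name != current]
    return bool(alternatives) or unused_good_spares > 0


if __name__ == "__main__":
    failing = {0, 1}  # assumed failing chips
    print(functional_configs(CONFIGS, failing))
    print(has_reserve(CONFIGS, "B", failing))
```

With chips 0 and 1 failing, configuration A places both in the same ECC word (a 2-bit error), while B and C separate them. A card running in B therefore still has reserve in this model. With three failing chips and only two ECC words, no configuration is functional, and reserve then depends entirely on unused spares.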