Title :
A new class of array codes for memory storage
Author :
Lastras-Montano, L.A. ; Meaney, P.J. ; Stephens, E. ; Trager, B.M. ; O'Connor, J. ; Alves, L.C.
Author_Institution :
IBM T.J. Watson Res. Center, Yorktown Heights, NY, USA
Abstract :
In this article we describe a class of error control codes called “diff-MDS” codes that are custom designed for highly resilient computer memory storage. The error scenarios of concern range from simple single bit errors, to memory chip failures and catastrophic memory module failures. Our approach to building codes for this setting relies on the concept of expurgating a parity code that is easy to decode for memory module failures so that a few additional small errors can be handled as well, thus preserving most of the decoding complexity advantages of the original code while extending its original intent. The manner in which we expurgate is carefully crafted so that the strength of the resulting code is comparable to that of a Reed-Solomon code when used for this particular setting. An instance of this class of algorithms has been incorporated in IBM's zEnterprise mainframe offering, setting a new industry standard for memory resiliency.
Keywords :
DRAM chips; Reed-Solomon codes; circuit complexity; decoding; error correction codes; error statistics; fault diagnosis; mainframes; memory architecture; parity check codes; DRAM; Reed-Solomon code; array code; catastrophic memory module failure; computer memory storage; decoding complexity; diff-MDS code; dynamic random access memory; error control code; mainframe; memory chip failure; memory resiliency; parity code; single bit error; Arrays; Complexity theory; Error correction; Redundancy
Conference_Title :
Information Theory and Applications Workshop (ITA), 2011
Conference_Location :
La Jolla, CA
Print_ISBN :
978-1-4577-0360-7
DOI :
10.1109/ITA.2011.5743586