DocumentCode
1244079
Title
Cache and memory error detection, correction, and reduction techniques for terrestrial servers and workstations
Author
Slayman, Charles W.
Author_Institution
Sun Microsystems Inc., Santa Clara, CA, USA
Volume
5
Issue
3
fYear
2005
Firstpage
397
Lastpage
404
Abstract
As the size of the SRAM cache and DRAM memory grows in servers and workstations, cosmic-ray errors are becoming a major concern for systems designers and end users. Several techniques exist to detect and mitigate the occurrence of cosmic-ray upset, such as error detection, error correction, cache scrubbing, and array interleaving. This paper covers the tradeoffs of these techniques in terms of area, power, and performance penalties versus increased reliability. In most system applications, a combination of several techniques is required to meet the necessary reliability and data-integrity targets.
Keywords
DRAM chips; SRAM chips; cache storage; error correction codes; error detection codes; failure analysis; fault tolerance; network servers; radiation effects; workstations; DRAM memory; SRAM cache; array interleaving; cache scrubbing; cosmic ray errors; data integrity targets; error correction code; error detection code; reduction techniques; reliability; soft error rate; terrestrial servers; terrestrial workstations; Blades; Error correction; File servers; Neutrons; Power system reliability; Random access memory; Switches; Cosmic-ray upset; error correction code (ECC); memory fault tolerance; soft-error rate (SER)
fLanguage
English
Journal_Title
Device and Materials Reliability, IEEE Transactions on
Publisher
IEEE
ISSN
1530-4388
Type
jour
DOI
10.1109/TDMR.2005.856487
Filename
1545899