• DocumentCode
    1244079
  • Title

    Cache and memory error detection, correction, and reduction techniques for terrestrial servers and workstations

  • Author

    Slayman, Charles W.

  • Author_Institution
    Sun Microsystems Inc., Santa Clara, CA, USA
  • Volume
    5
  • Issue
    3
  • fYear
    2005
  • Firstpage
    397
  • Lastpage
    404
  • Abstract
    As the size of the SRAM cache and DRAM memory grows in servers and workstations, cosmic-ray errors are becoming a major concern for systems designers and end users. Several techniques exist to detect and mitigate the occurrence of cosmic-ray upset, such as error detection, error correction, cache scrubbing, and array interleaving. This paper covers the tradeoffs of these techniques in terms of area, power, and performance penalties versus increased reliability. In most system applications, a combination of several techniques is required to meet the necessary reliability and data-integrity targets.
  • Keywords
    DRAM chips; SRAM chips; cache storage; error correction codes; error detection codes; failure analysis; fault tolerance; network servers; radiation effects; workstations; DRAM memory; SRAM cache; array interleaving; cache scrubbing; cosmic ray errors; data integrity targets; error correction code; error detection code; reduction techniques; reliability; soft error rate; terrestrial servers; terrestrial workstations; Blades; Error correction; Error correction codes; File servers; Neutrons; Power system reliability; Random access memory; Switches; Workstations; Cosmic-ray upset; error correction code (ECC); memory fault tolerance; soft-error rate (SER)
  • fLanguage
    English
  • Journal_Title
    Device and Materials Reliability, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1530-4388
  • Type

    jour

  • DOI
    10.1109/TDMR.2005.856487
  • Filename
    1545899