DocumentCode :
822957
Title :
Reducing Data Cache Susceptibility to Soft Errors
Author :
Sridharan, Vilas ; Asadi, Hossein ; Tahoori, Mehdi B. ; Kaeli, David
Author_Institution :
Dept. of Electrical and Computer Engineering, Northeastern University, Boston, MA
Volume :
3
Issue :
4
Year :
2006
Firstpage :
353
Lastpage :
364
Abstract :
Data caches are a fundamental component of most modern microprocessors. They provide for efficient read/write access to data memory. Errors occurring in the data cache can corrupt data values or state, and can easily propagate throughout the memory hierarchy. One of the main threats to data cache reliability is soft (transient, nonreproducible) errors. These errors can occur more often than hard (permanent) errors, and most often arise from single event upsets (SEUs) caused by strikes from energetic particles such as neutrons and alpha particles. Many protection techniques exist for data caches; the most common are ECC (error correcting codes) and parity. These protection techniques detect all single-bit errors and, in the case of ECC, correct them. To make proper design decisions about which protection technique to use, accurate design-time modeling of cache reliability is crucial. In addition, as caches increase in storage capacity, another important goal is to reduce the failure rate of a cache, to limit disruption to normal system operation. In this paper, we present our modeling approach for assessing the impact of soft errors using architectural simulators. We also describe a new technique for reducing the vulnerability of data caches: refetching. By selectively refetching cache lines from the ECC-protected L2 cache, we can significantly reduce the vulnerability of the L1 data cache. We discuss and present results for two different algorithms that perform selective refetch. Experimental results show that we can obtain an 85 percent decrease in vulnerability when running the SPEC2K benchmark suite while only experiencing a slight decrease in performance. Our results demonstrate that selective refetch can cost-effectively decrease the error rate of an L1 data cache.
Keywords :
cache storage; error correction codes; fault tolerance; memory architecture; storage management; architectural simulator; cache line refetching; data cache reliability; data cache susceptibility; data memory; error correcting code; fault tolerance; microprocessor; parity; soft error; Alpha particles; Cache storage; Error correction; Error correction codes; Microprocessors; Neutrons; Protection; Read-write memory; Single event transient; Single event upset; Fault tolerance; cache memories; error modeling; refetch; refresh; reliability; soft errors
Language :
English
Journal_Title :
IEEE Transactions on Dependable and Secure Computing
Publisher :
IEEE
ISSN :
1545-5971
Type :
Journal article
DOI :
10.1109/TDSC.2006.55
Filename :
4012647