Title :
Balancing Performance and Reliability in the Memory Hierarchy
Author :
Asadi, Ghazanfar-Hossein ; Sridharan, Vilas ; Tahoori, Mehdi B. ; Kaeli, David
Author_Institution :
Dept. of Electr. & Comput. Eng., Northeastern Univ., Boston, MA
Abstract :
Cosmic-ray-induced soft errors in cache memories are becoming a major threat to the reliability of microprocessor-based systems. In this paper, we present a new method to accurately estimate the reliability of cache memories. We have measured the MTTF (mean time to failure) of unprotected first-level (L1) caches for twenty programs taken from the SPEC2000 benchmark suite. Our results show that a 16 KB first-level cache has an MTTF of at least 400 years (for a raw error rate of 0.002 FIT/bit). However, this MTTF is significantly reduced for higher error rates and larger cache sizes. Our results show that, for selected programs, a 64 KB first-level cache is more than 10 times as vulnerable to soft errors as a 16 KB cache. Our work also illustrates that the reliability of cache memories is highly application-dependent. Finally, we present three different techniques to reduce the susceptibility of first-level caches to soft errors by two orders of magnitude. Our analysis shows how to achieve a balance between performance and reliability.
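To put the figures in the abstract in context, the following is a minimal back-of-the-envelope sketch, not the estimation method the paper presents. It assumes the standard definition of FIT (one failure per 10^9 device-hours) and the pessimistic simplification that every bit flip in the cache causes a failure, with no masking. The function name `raw_mttf_years` and the 8760-hour year are illustrative choices, not taken from the paper.

```python
HOURS_PER_YEAR = 24 * 365  # 8760 hours; leap years ignored (assumption)


def raw_mttf_years(cache_bytes: int, fit_per_bit: float) -> float:
    """Raw MTTF in years, assuming every bit flip counts as a failure.

    FIT is failures per 1e9 device-hours, so the per-bit rates add up
    linearly over all bits in the cache.
    """
    bits = cache_bytes * 8
    total_fit = bits * fit_per_bit   # failures per 1e9 hours for the whole cache
    mttf_hours = 1e9 / total_fit     # mean hours between failures
    return mttf_hours / HOURS_PER_YEAR


# Raw estimates for the two cache sizes named in the abstract.
mttf_16k = raw_mttf_years(16 * 1024, 0.002)
mttf_64k = raw_mttf_years(64 * 1024, 0.002)

print(f"16 KB: {mttf_16k:.1f} years")
print(f"64 KB: {mttf_64k:.1f} years")
print(f"ratio: {mttf_16k / mttf_64k:.1f}x")
```

Under these assumptions the 16 KB raw estimate lands in the same few-hundred-year range as the abstract's figure. Bit count alone makes the 64 KB cache only 4 times as failure-prone in this model, so the more-than-10-times gap reported for selected programs cannot be explained by capacity scaling alone, which fits the abstract's remark that reliability is highly application-dependent.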
Keywords :
benchmark testing; cache storage; error analysis; microprocessor chips; performance evaluation; reliability; SPEC2000 benchmark suite; cache memories; cosmic-ray induced soft errors; mean-time-to-failure; microprocessor-based systems; unprotected first-level caches; Cache memory; Computer errors; Error analysis; Error correction codes; Hardware; Protection; Read-write memory; Redundancy; Reliability engineering; Single event upset
Conference_Title :
Performance Analysis of Systems and Software, 2005. ISPASS 2005. IEEE International Symposium on
Conference_Location :
Austin, TX
Print_ISBN :
0-7803-8965-4
DOI :
10.1109/ISPASS.2005.1430581