Title :
Microarchitecture soft error vulnerability characterization and mitigation under 3D integration technology
Author :
Zhang, Wangyuan ; Li, Tao
Author_Institution :
Dept. of Electr. & Comput. Eng., Univ. of Florida, Gainesville, FL
Abstract :
As semiconductor processing techniques continue to scale down, transient faults, also known as soft errors, are increasingly becoming a reliability threat to high-performance microprocessors fabricated using state-of-the-art CMOS technologies. Emerging 3D chip integration techniques leverage vertically stacked structures to reduce on-chip wire delay and have shown the capability of overcoming interconnect bottlenecks as well as reducing power consumption. While the benefits of 3D die stacking for microprocessor performance and power have been extensively investigated recently, its implications for transient fault susceptibility are largely unknown. In this work, we make the first attempt to characterize microarchitecture soft error vulnerabilities across the stacked chip layers under 3D integration technologies. Using models and simulations that capture soft error physical mechanisms and their circuit/architecture-level impact, our study reveals opportunities to leverage two characteristics of 3D integration (the vertically stacked structure and the incorporation of heterogeneous process technologies) to achieve enhanced reliability. We show that the first characteristic allows outer layers to shield inner layers from particle strikes, and that the second enables the deployment of error resilience device techniques (e.g. Silicon-On-Insulator) on vulnerable layers to achieve a reliability target while minimizing manufacturing cost. We further propose a set of microarchitecture techniques that can effectively exploit the reliability benefits offered by 3D technologies. For example, we propose scheduling vulnerable in-flight instructions to reliable layers and designing robust register files by combining reliability-hardened circuits, program value vulnerability and 3D integration techniques. Experimental results show that these techniques are able to reduce 3D microarchitectures' soft error rate by up to 88% compared to a planar design. We further evaluate the thermal implications of the proposed techniques and conclude that their impact on chip temperature is negligible.
Keywords :
CMOS integrated circuits; computer architecture; fault diagnosis; integrated circuit interconnections; integrated circuit reliability; microprocessor chips; scheduling; silicon-on-insulator; 3D chip integration techniques; 3D integration technology; CMOS technology; die stacking; error resilience device techniques; high-performance microprocessors; in-flight instructions; interconnect bottlenecks; microarchitecture soft error vulnerability characterization; microarchitecture techniques; microprocessor performance; on-chip wire delay; power consumption; program value vulnerability; reliability target; reliability threat; reliability-hardened circuits; robust register files; scheduling; semiconductor processing techniques; silicon-on-insulator; soft error physical mechanism; stacked chip layers; transient fault susceptibility; transient faults; CMOS process; CMOS technology; Circuit faults; Integrated circuit reliability; Microarchitecture; Microprocessors; Semiconductor device reliability; Silicon on insulator technology; Stacking; Wire;
Conference_Title :
Microarchitecture, 2008. MICRO-41. 2008 41st IEEE/ACM International Symposium on
Conference_Location :
Lake Como
Print_ISBN :
978-1-4244-2836-6
Electronic_ISBN :
1072-4451
DOI :
10.1109/MICRO.2008.4771811