DocumentCode
597225
Title
Zero-performance-overhead online fault detection and diagnosis in 3D stacked integrated circuits
Author
Safiruddin, S. ; Lefter, Mihai ; Borodin, Dmitri ; Voicu, G. ; Cotofana, Sorin D.
Author_Institution
Fac. of Electr. Eng., Math. & Comput. Sci., Delft Univ. of Technol., Delft, Netherlands
fYear
2012
fDate
4-6 July 2012
Firstpage
123
Lastpage
130
Abstract
In this paper, we present a zero-performance-overhead online fault detection and diagnosis scheme that exploits the vertical proximity of hardware inherent in 3D stacked integrated circuits (3D-SICs). We consider a 3D stacked processor in which each die executes independent instruction streams from different threads. We propose the vertical clustering of functionally identical computational blocks to enable the use of the 3D-specific low-latency interlayer communication infrastructure. The clustering facilitates the parallel re-execution of instructions on idle units located close to the units that initially computed them, and in this way creates the means for fault detection and diagnosis. We detail the control, interconnection communication infrastructure, instruction distribution, and results processing policies required for our scheme. To determine the effectiveness of the approach, we evaluate by simulation its performance in terms of diagnosis latency and percentage of verified operations on 3- to 8-core processors implemented on 3- to 8-tier 3D-SICs, respectively. Our experiments indicate that the diagnosis latency ranges from 9 to 5 cycles for 3 to 8 cores, respectively. For transient fault detection, our simulations indicate that 86% to 94% of all executed instructions are verified for 3 to 8 cores, respectively. When only one of the layers is protected against transient faults, the percentage of verified operations increases to between 94% and 99% under the same simulation conditions. This suggests that, if certain conditions are fulfilled at design time, our approach can completely protect one instruction stream identified as critical for the application. Our simulations indicate that the proposed scheme has the potential to improve the dependability of 3D stacked integrated circuits with no performance overhead and at the expense of little area overhead.
Keywords
fault diagnosis; integrated circuit interconnections; microprocessor chips; performance evaluation; three-dimensional integrated circuits; 3D specific low-latency interlayer communication infrastructure; 3D stacked integrated circuits dependability; 3D stacked processor; 3D-SIC; core processors; diagnosis latency; functionally identical computational blocks; instruction distribution; instruction streams; interconnection communication infrastructure; parallel re-execution; performance evaluation; performance overhead; results processing policy; transient fault detection; vertical clustering; vertical proximity; zero-performance-overhead online fault detection and diagnosis; Fault detection; Hardware; Integrated circuit interconnections; Reliability; Through-silicon vias; Transient analysis;
fLanguage
English
Publisher
IEEE
Conference_Titel
Nanoscale Architectures (NANOARCH), 2012 IEEE/ACM International Symposium on
Conference_Location
Amsterdam
Print_ISBN
978-1-4503-1671-2
Type
conf
Filename
6464153