DocumentCode
2845694
Title
Optimizing Issue Queue Reliability to Soft Errors on Simultaneous Multithreaded Architectures
Author
Fu, Xin ; Zhang, Wangyuan ; Li, Tao ; Fortes, José
Author_Institution
Dept. of ECE, Univ. of Florida, Gainesville, FL
fYear
2008
fDate
9-12 Sept. 2008
Firstpage
190
Lastpage
197
Abstract
The issue queue (IQ) is a key microarchitecture structure for exploiting instruction-level and thread-level parallelism in dynamically scheduled simultaneous multithreaded (SMT) processors. However, exploiting more parallelism makes a conventional IQ highly susceptible to transient faults. With rapidly increasing soft error rates, the IQ is likely to be a reliability hot-spot on SMT processors fabricated with advanced technology nodes using smaller and denser transistors with lower threshold voltages and tighter noise margins. In this paper, we explore microarchitecture techniques to optimize IQ reliability to soft errors on SMT architectures. We propose to use off-line instruction vulnerability profiling to identify reliability-critical instructions. The gathered information is then used to guide reliability-aware instruction scheduling and resource allocation in multithreaded execution environments. We evaluate the efficiency of the proposed schemes across various SMT workload mixes. Extensive simulation results show that, on average, our microarchitecture-level soft error mitigation techniques reduce IQ vulnerability by 42% while improving performance by 1%. To keep runtime IQ reliability within pre-defined thresholds, we propose dynamic vulnerability management (DVM) mechanisms. Experimental results show that our DVM techniques can effectively achieve desired reliability/performance tradeoffs.
Keywords
multi-threading; multiprocessing systems; parallel architectures; scheduling; SMT processors; dynamic vulnerability management; dynamically scheduled simultaneous multithreaded processors; instruction-level parallelism; issue queue reliability; multithreaded execution environments; reliability-aware instruction scheduling; resource allocation; simultaneous multithreaded architectures; thread-level parallelism; Computer errors; Dynamic scheduling; Error analysis; Microarchitecture; Processor scheduling; Resource management; Surface-mount technology; Threshold voltage; Transistors; Working environment noise;
fLanguage
English
Publisher
IEEE
Conference_Titel
Parallel Processing, 2008. ICPP '08. 37th International Conference on
Conference_Location
Portland, OR
ISSN
0190-3918
Print_ISBN
978-0-7695-3374-2
Electronic_ISBN
0190-3918
Type
conf
DOI
10.1109/ICPP.2008.23
Filename
4625849