DocumentCode
2344424
Title
Adaptive execution assistance for multiplexed fault-tolerant chip multiprocessors
Author
Subramanyan, Pramod ; Singh, Virendra ; Saluja, Kewal K. ; Larsson, Erik
Author_Institution
Princeton Univ., Princeton, NJ, USA
fYear
2011
fDate
9-12 Oct. 2011
Firstpage
419
Lastpage
426
Abstract
Relentless scaling of CMOS fabrication technology has made contemporary integrated circuits increasingly susceptible to transient faults, wearout-related permanent faults, intermittent faults and process variations. Therefore, mechanisms to mitigate the effects of decreased reliability are expected to become essential components of future general-purpose microprocessors. In this paper, we introduce a new throughput-efficient architecture for multiplexed fault-tolerant chip multiprocessors (CMPs). Our proposal relies on the new technique of adaptive execution assistance, which dynamically varies instruction outcomes forwarded from the leading core to the trailing core based on measures of trailing core performance. We identify policies and design low-overhead hardware mechanisms to achieve this. Our work also introduces a new priority-based thread-scheduling algorithm for multiplexed architectures that improves multiplexed fault-tolerant CMP throughput by prioritizing stalled threads. Through simulation-based evaluation, we find that our proposal delivers 17.2% higher throughput than perfect dual modular redundant (DMR) execution and outperforms previous proposals for throughput-efficient CMP architectures.
Keywords
CMOS integrated circuits; circuit simulation; computer architecture; fault tolerant computing; integrated circuit reliability; microprocessor chips; multi-threading; CMOS fabrication technology; CMP; adaptive execution assistance technique; integrated circuit; intermittent fault; low overhead hardware mechanism; multiplexed fault tolerant chip multiprocessors; priority based thread scheduling algorithm; simulation based evaluation; throughput-efficient architecture; transient fault; wearout related permanent fault; Lead; Multiplexing; Reliability; Throughput;
fLanguage
English
Publisher
IEEE
Conference_Titel
Computer Design (ICCD), 2011 IEEE 29th International Conference on
Conference_Location
Amherst, MA
ISSN
1063-6404
Print_ISBN
978-1-4577-1953-0
Type
conf
DOI
10.1109/ICCD.2011.6081432
Filename
6081432