DocumentCode :
2150380
Title :
Modeling and analysis of fault-tolerant distributed memories for Networks-on-Chip
Author :
BanaiyanMofrad, Abbas ; Dutt, Nikil ; Girao, Gustavo
Author_Institution :
Center for Embedded Computer Systems, University of California, Irvine, USA
fYear :
2013
fDate :
18-22 March 2013
Firstpage :
1605
Lastpage :
1608
Abstract :
Advances in technology scaling make Networks-on-Chip (NoCs) increasingly susceptible to failures, which raises various reliability challenges. As on-chip memories occupy a growing share of chip area, maintaining the fault tolerance of distributed on-chip memories becomes a major design challenge. We propose a system-level design methodology for scalable fault tolerance of distributed on-chip memories in NoCs. We introduce a novel reliability clustering model for fault-tolerance analysis and shared redundancy management of on-chip memory blocks. We perform extensive design space exploration, applying the proposed reliability clustering to a block-redundancy fault-tolerant scheme to evaluate the tradeoffs among reliability, performance, and overheads. Evaluations on a 64-core chip multiprocessor (CMP) with an 8x8 mesh NoC show that distinct strategies in our case study may yield up to 20% performance improvement and up to 25% energy savings across different benchmarks, and uncover interesting design configurations.
Keywords :
Analytical models; Fault tolerant systems; Redundancy; Reliability engineering; System-on-chip;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Design, Automation & Test in Europe Conference & Exhibition (DATE), 2013
Conference_Location :
Grenoble, France
ISSN :
1530-1591
Print_ISBN :
978-1-4673-5071-6
Type :
conf
DOI :
10.7873/DATE.2013.326
Filename :
6513772