DocumentCode :
2344262
Title :
Functional correctness for CMP interconnects
Author :
Abdel-Khalek, Rawan ; Parikh, Ritesh ; DeOrio, Andrew ; Bertacco, Valeria
Author_Institution :
Dept. of Comput. Sci. & Eng., Univ. of Michigan, Ann Arbor, MI, USA
fYear :
2011
fDate :
9-12 Oct. 2011
Firstpage :
352
Lastpage :
359
Abstract :
As transistor counts continue to scale, modern designs are transitioning towards large chip multi-processors (CMPs). To match the advancing performance of CMPs, on-chip interconnects are becoming increasingly complex, commonly deploying advanced network-on-chip (NoC) structures. Ensuring the correct operation of these system-level infrastructures has become increasingly problematic and, to keep functional design errors from manifesting in the final product, there is a need for mechanisms that safeguard communication integrity at runtime. In this paper, we propose SafeNoC, an end-to-end error detection and recovery solution to ensure the functional correctness of CMP interconnects. SafeNoC augments the existing interconnect with a simple, lightweight checker network that is guaranteed to deliver messages correctly. For each data message sent over the primary NoC, a look-ahead signature is transmitted over the checker network and is used to detect errors in the corresponding data message. If a functional communication bug is detected, a novel recovery algorithm reconstructs the data that was in flight at the time of the error, ensuring that it reaches the intended destination. In our experiments, we found that SafeNoC can recover from a wide variety of errors, with almost no performance impact in the absence of errors. A lightweight solution, SafeNoC incurs a 2.41% area overhead in a 64-core CMP, 7× smaller than common retransmission-based approaches.
Keywords :
integrated circuit interconnections; multiprocessing systems; network-on-chip; CMP interconnects; SafeNoC; chip multiprocessors; data message; end-to-end error detection; functional communication bug; functional correctness; functional design error; lightweight checker network; look-ahead signature; network-on-chip; on-chip interconnects; system-level infrastructure; transistor counts; Computer bugs; Hardware; Routing protocols; Runtime; Software; System recovery; Topology;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computer Design (ICCD), 2011 IEEE 29th International Conference on
Conference_Location :
Amherst, MA
ISSN :
1063-6404
Print_ISBN :
978-1-4577-1953-0
Type :
conf
DOI :
10.1109/ICCD.2011.6081423
Filename :
6081423