DocumentCode :
2799495
Title :
An Online Mechanism to Verify Datapath Execution Using Existing Resources in Chip Multiprocessors
Author :
Rodrigues, Rance ; Kundu, Sandip
Author_Institution :
Dept. of Electr. & Comput. Eng., Univ. of Massachusetts at Amherst, Amherst, MA, USA
fYear :
2011
fDate :
20-23 Nov. 2011
Firstpage :
161
Lastpage :
166
Abstract :
With scaling of process technology, transistor and interconnect reliability has emerged as a growing concern for modern microprocessors. Traditional solutions for reliable operation rely on double or triple modular redundancy. However, chip multiprocessors (CMPs) provide a unique opportunity for low-cost data path verification for reliable operation. A recent paper presents a fault recovery scheme based on outsourcing instructions from identified faulty cores to fault-free cores capable of executing them. The communication between the cores is managed via an inter-core queue (ICQ). However, no faulty core identification mechanism was presented. In this paper, we extend this research to enable self-test of the data path execution in a multicore processor. Specifically, whenever instructions are retired locally on a core (local), they are also dispatched via the ICQ to another nearby (remote) core for execution verification. Results obtained from the local and remote cores are compared. If a fault is detected, the instruction may be re-executed on both local and remote cores to distinguish between hard and soft faults. In this study, we present results on the frequency of coverage and the latency between first execution and its verification. We also report the performance impact of execution verification on the remote core. Results indicate that the proposed scheme is capable of remotely verifying ~80% of integer ALU instructions and >98% of other instruction types, with a very small performance impact of just ~1% on the tester core and less than 1% area overhead.
Keywords :
automatic testing; fault diagnosis; integrated circuit interconnections; integrated circuit reliability; microprocessor chips; multiprocessing systems; CMP; ICQ; chip multiprocessor; datapath execution self-testing; datapath execution verification; fault detection; fault free core; fault recovery scheme; faulty core identification mechanism; integer ALU instruction; interconnection reliability; intercore queue; microprocessor; multicore processor; online mechanism; transistor; Delay; Fault detection; Integrated circuit reliability; Multicore processing; Redundancy; Testing; execution datapath test; low cost test; online test; opportunistic test;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Test Symposium (ATS), 2011 20th Asian
Conference_Location :
New Delhi
ISSN :
1081-7735
Print_ISBN :
978-1-4577-1984-4
Type :
conf
DOI :
10.1109/ATS.2011.82
Filename :
6114530