• DocumentCode
    2799495
  • Title

    An Online Mechanism to Verify Datapath Execution Using Existing Resources in Chip Multiprocessors

  • Author

    Rodrigues, Rance ; Kundu, Sandip

  • Author_Institution
    Dept. of Electr. & Comput. Eng., Univ. of Massachusetts at Amherst, Amherst, MA, USA
  • fYear
    2011
  • fDate
    20-23 Nov. 2011
  • Firstpage
    161
  • Lastpage
    166
  • Abstract
    With scaling of process technology, transistor and interconnect reliability has emerged as a growing concern for modern microprocessors. Traditional solutions for reliable operation rely on double or triple modular redundancy. However, chip multiprocessors (CMPs) provide a unique opportunity for low-cost datapath verification. A recent paper presents a fault recovery scheme based on outsourcing instructions from identified faulty cores to fault-free cores capable of executing them. The communication between the cores is managed via an inter-core queue (ICQ). However, no mechanism for identifying faulty cores was presented. In this paper, we extend this research to enable self-test of datapath execution in a multicore processor. Specifically, whenever instructions are retired on a core (the local core), they are also dispatched via the ICQ to another nearby (remote) core for execution verification. Results obtained from the local and remote cores are compared. If a fault is detected, the instruction may be re-executed on both local and remote cores to distinguish between hard and soft faults. In this study, we present results on the frequency of coverage and on the latency between first execution and its verification. We also report the performance impact of execution verification on the remote core. Results indicate that the proposed scheme is capable of remotely verifying ~80% of integer ALU instructions and >98% of other instruction types, with a very small performance impact of just ~1% on the tester core and less than 1% area overhead.
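    The compare-and-re-execute step described in the abstract can be illustrated with a minimal software sketch. It is not the authors' hardware design: the names `Verdict` and `verify`, the modelling of each core as a plain function, and the classification rule (a mismatch that disappears on re-execution is treated as soft, one that persists as hard) are all assumptions made for illustration. The abstract does not say how the faulty core is singled out, so the sketch does not attempt that.

```python
from enum import Enum
from typing import Callable, Tuple

Instr = Tuple[int, int]          # hypothetical instruction: two operands
Core = Callable[[Instr], int]    # hypothetical model of a core's datapath


class Verdict(Enum):
    OK = "ok"
    SOFT_FAULT = "soft"   # assumed: mismatch vanished on re-execution
    HARD_FAULT = "hard"   # assumed: mismatch persisted on re-execution


def verify(instr: Instr, local_core: Core, remote_core: Core) -> Verdict:
    """Compare local and remote results; re-execute once on a mismatch."""
    if local_core(instr) == remote_core(instr):
        return Verdict.OK
    # Mismatch detected: run the instruction again on both cores.
    if local_core(instr) == remote_core(instr):
        return Verdict.SOFT_FAULT
    return Verdict.HARD_FAULT
```

    A single retry is the simplest policy consistent with the abstract; a real design might retry more than once or involve a third core before declaring a hard fault.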
  • Keywords
    automatic testing; fault diagnosis; integrated circuit interconnections; integrated circuit reliability; microprocessor chips; multiprocessing systems; CMP; ICQ; chip multiprocessor; datapath execution self-testing; datapath execution verification; fault detection; fault free core; fault recovery scheme; faulty core identification mechanism; integer ALU instruction; interconnection reliability; intercore queue; microprocessor; multicore processor; online mechanism; transistor; Delay; Fault detection; Integrated circuit reliability; Multicore processing; Redundancy; Testing; execution datapath test; low cost test; online test; opportunistic test;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Test Symposium (ATS), 2011 20th Asian
  • Conference_Location
    New Delhi
  • ISSN
    1081-7735
  • Print_ISBN
    978-1-4577-1984-4
  • Type

    conf

  • DOI
    10.1109/ATS.2011.82
  • Filename
    6114530