• DocumentCode
    83794
  • Title

    Comprehensive and Efficient Design Parameter Selection for Soft Error Resilient Processors via Universal Rules

  • Author

    Duan, Lingjie ; Ying Zhang ; Bin Li ; Lu Peng

  • Author_Institution
    AMD, Inc., Austin, TX, USA
  • Volume
    63
  • Issue
    9
  • fYear
    2014
  • fDate
    Sept. 2014
  • Firstpage
    2201
  • Lastpage
    2214
  • Abstract
    Soft errors have been significantly degrading the reliability of current processors, whose feature sizes and supply voltages are scaling down rapidly. In this paper, we propose two effective approaches to characterize processor reliability against soft errors at the presilicon stage. Using a rule search strategy named the Patient Rule Induction Method (PRIM), we generate a set of selective rules on key design parameters. These rules quantify the design space subregion with the lowest effective soft error rate (SER), thus providing useful guidelines for designing reliable processors. Furthermore, we propose using Classification and Regression Trees (CART) to partition the design space into a number of small subregions, each associated with a representative SER value. This gives the processor designer a global view of the SER distribution, enabling a comprehensive analysis over the entire design space. More importantly, both approaches generate “universal” models whose effectiveness is validated with a set of test programs unseen during training. Compared to traditional application-specific design space studies, our models' cross-program capability can save considerable training effort in the era of multithreading. Finally, a case study on multiprocessors is performed to simultaneously balance multiple design metrics, including reliability, performance, and power.
  • Keywords
    computer architecture; multi-threading; multiprocessing systems; program processors; radiation hardening (electronics); regression analysis; trees (mathematics); CART; PRIM; SER distribution; classification and regression trees; comprehensive analysis; cross-program capability; design metrics; design parameter selection; design space partitioning; design space subregion quantification; feature sizes; multiprocessors; multithreading; patient rule induction method; performance metrics; power metrics; presilicon stage; processor reliability characterization; reliability metrics; reliable processor design; representative SER value; rule search strategy; soft error rate; soft error resilient processors; supply voltages; test programs; universal models; universal rules; Hardware reliability; modeling and prediction; modeling of computer architecture;
  • fLanguage
    English
  • Journal_Title
    Computers, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0018-9340
  • Type

    jour

  • DOI
    10.1109/TC.2013.24
  • Filename
    6522413