• DocumentCode
    864486
  • Title
    Transient-fault recovery for chip multiprocessors
  • Author
    Gomaa, Mohamed A. ; Scarbrough, Chad ; Vijaykumar, T.N. ; Pomeranz, Irith
  • Author_Institution
    Purdue Univ., West Lafayette, IN, USA
  • Volume
    23
  • Issue
    6
  • fYear
    2003
  • Firstpage
    76
  • Lastpage
    83
  • Abstract
    Chip-level redundant threading with recovery (CRTR) for chip multiprocessors extends previous transient-fault detection schemes to provide fault recovery. To hide interprocessor latency, CRTR uses a long slack enabled by asymmetric commit and uses the trailing thread's state for recovery. CRTR increases bandwidth supply by pipelining communication paths and reduces bandwidth demand by extending dependence-based checking elision.
  • Keywords
    fault tolerant computing; microprocessor chips; multiprocessing systems; chip multiprocessors; chip-level redundant threading with recovery; communication paths; dependence-based checking elision; interprocessor latency; transient fault recovery; transient-fault detection; Bandwidth; Cathode ray tubes; Error correction codes; Fault detection; Fault tolerance; Microprocessors; Multithreading; Protection; Registers; Yarn
  • fLanguage
    English
  • Journal_Title
    Micro, IEEE
  • Publisher
    ieee
  • ISSN
    0272-1732
  • Type
    jour
  • DOI
    10.1109/MM.2003.1261390
  • Filename
    1261390