• DocumentCode
    1504431
  • Title

    Fault-containment in cache memories for TMR redundant processor systems

  • Author

    Chen, Chung-Ho ; Somani, Arun K.

  • Author_Institution
    Dept. of Electron. Eng., Nat. Yunlin Univ. of Sci. & Technol., Touliu, Taiwan
  • Volume
    48
  • Issue
    4
  • fYear
    1999
  • fDate
    4/1/1999 12:00:00 AM
  • Firstpage
    386
  • Lastpage
    397
  • Abstract
    Cache data errors read by a processor may cause CPU control-flow errors and force the system to enter a CPU-cache reintegration process in redundant processor systems. The reintegration process degrades system performance and reliability. To reduce the occurrence of such events, we propose a real-time error recovery scheme that provides effective fault-containment for data errors in cache memories. The scheme is based on broadcasting a dirty cache line after it is modified. It effectively exploits the redundancy of a fault-tolerant system using hardware voting. The scheme recovers, with full coverage, from erroneous cache data written by a processor. This error recovery feature remedies the insufficiency of error-correcting codes, which are unable to prevent such errors. In addition, more than 60 percent of cache lines are fully covered for recovery from errors originating in the cache itself, including unrecoverable ECC errors. The protocol can also be used to speed up the CPU-cache reintegration process for a temporarily failed processor. The performance overhead of the protocol is limited to broadcasting only 2-3 percent of the total memory references.
  • Keywords
    cache storage; fault tolerant computing; redundancy; TMR redundant processor systems; cache data errors; cache memories; fault-containment; fault-tolerant system; real-time error recovery; Broadcasting; Cache memory; Control systems; Degradation; Error correction; Error correction codes; Force control; Protocols; Redundancy; System performance
  • fLanguage
    English
  • Journal_Title
    Computers, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0018-9340
  • Type

    jour

  • DOI
    10.1109/12.762529
  • Filename
    762529