• DocumentCode
    2199009
  • Title

    Error detection mechanisms for massively parallel multiprocessors

  • Author

    Dal Cin, M. ; Hohl, W. ; Michel, E. ; Pataricza, A.

  • Author_Institution
    Math. Inst., Erlangen-Nürnberg Univ., Germany
  • fYear
    1993
  • fDate
    27-29 Jan 1993
  • Firstpage
    401
  • Lastpage
    408
  • Abstract
    A survey of the most important methods for error detection in multiprocessor systems is presented. A detailed comparison between watchdog-processor-based and master-checker-based fault tolerance is given. The fault coverage, hardware and run-time overhead are discussed, based on the experience gained in the development of the MEMSY fault-tolerant multiprocessor system. The cumulative effects resulting from the simultaneous use of different hardware-near and high-level fault-tolerance mechanisms are shown.
  • Keywords
    error detection; fault tolerant computing; parallel machines; MEMSY fault-tolerant multiprocessor system; error detection mechanisms; fault coverage; hardware; massively parallel multiprocessors; master-checker based fault tolerance; run-time overhead; watchdog processor based fault tolerance; Application software; Computer architecture; Concurrent computing; Delay; Fault detection; Fault tolerance; Hardware; Multiprocessing systems; Redundancy; Testing;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel and Distributed Processing, 1993. Proceedings. Euromicro Workshop on
  • Conference_Location
    Gran Canaria
  • Print_ISBN
    0-8186-3610-6
  • Type

    conf

  • DOI
    10.1109/EMPDP.1993.336378
  • Filename
    336378