• DocumentCode
    1736546
  • Title

    A repetitive fault tolerance model for parallel programs

  • Author

    Yen, I-Ling ; Leiss, Ernst L. ; Bastani, Farokh B.

  • Author_Institution
    Dept. of Comput. Sci., Houston Univ., TX, USA
  • fYear
    1993
  • Firstpage
    447
  • Abstract
    The authors propose a repetitive fault tolerance (RFT) model, which provides an environment for the systematic development of fault-tolerant parallel programs. RFT programs tolerate processor failures without sacrificing performance in the failure-free case: the system delivers optimal performance when all processors are working and continues to operate, at reduced performance, when failures occur. The system works as long as at least one processor is working. Thus, it provides not only a software solution for a highly reliable parallel computation environment but also an elegant solution for constructing reliable nonrepairable systems. The model is applied to three examples to illustrate the construction procedure, to evaluate the performance of repetitive fault-tolerant programs, and to demonstrate the applicability of the model.
  • Keywords
    fault tolerant computing; parallel programming; performance evaluation; programming environments; nonrepairable systems; optimal performance; parallel computation environment; parallel programs; processor failures; repetitive fault tolerance model; Application software; Checkpointing; Computer science; Degradation; Fault tolerance; Fault tolerant systems; Hardware; Redundancy; Space exploration; Very large scale integration;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    System Sciences, 1993, Proceeding of the Twenty-Sixth Hawaii International Conference on
  • Conference_Location
    Wailea, HI
  • Print_ISBN
    0-8186-3230-5
  • Type

    conf

  • DOI
    10.1109/HICSS.1993.284081
  • Filename
    284081