DocumentCode
1736546
Title
A repetitive fault tolerance model for parallel programs
Author
Yen, I-Ling ; Leiss, Ernst L. ; Bastani, Farokh B.
Author_Institution
Dept. of Comput. Sci., Houston Univ., TX, USA
fYear
1993
Firstpage
447
Abstract
The authors propose a repetitive fault tolerance (RFT) model, which provides an environment for the systematic development of fault-tolerant parallel programs. RFT programs can tolerate processor failures without sacrificing performance. The system delivers optimal performance when all processors are working and continues to operate, at reduced performance, when failures occur. The system also keeps working as long as at least one processor is working. Thus, it provides not only a software solution for a highly reliable parallel computation environment but also an elegant solution for constructing reliable nonrepairable systems. The model is applied to three examples to illustrate the construction procedure, to evaluate the performance of repetitive fault-tolerant programs, and to demonstrate the applicability of the model.
Keywords
fault tolerant computing; parallel programming; performance evaluation; programming environments; nonrepairable systems; optimal performance; parallel computation environment; parallel programs; processor failures; repetitive fault tolerance model; Application software; Checkpointing; Computer science; Degradation; Fault tolerance; Fault tolerant systems; Hardware; Redundancy; Space exploration; Very large scale integration;
fLanguage
English
Publisher
IEEE
Conference_Titel
System Sciences, 1993, Proceeding of the Twenty-Sixth Hawaii International Conference on
Conference_Location
Wailea, HI
Print_ISBN
0-8186-3230-5
Type
conf
DOI
10.1109/HICSS.1993.284081
Filename
284081