DocumentCode :
2438001
Title :
Fault Injection Campaign for a Fault Tolerant Duplex Framework
Author :
Sacco, Gian Franco ; Ferraro, Robert D. ; Von Allmen, Paul ; Rennels, Dave A.
Author_Institution :
California Inst. of Technol., Pasadena
fYear :
2007
fDate :
3-10 March 2007
Firstpage :
1
Lastpage :
9
Abstract :
Software-based fault tolerance may allow the use of COTS digital electronics in building a highly reliable computing system for spacecraft. In this work we present the results of a fault injection campaign we conducted on the Duplex Framework (DF). The DF is software developed by the UCLA group [1], [2] that runs two copies (or replicas) of the same program on two different nodes of a commercial off-the-shelf (COTS) computer cluster. By means of a third process (the comparator), which runs on a different node and constantly monitors the results computed by the two replicas, the DF is able to restart the two replica processes if an inconsistency in their computation is detected. To test the reliability of the DF, we wrote a simple fault injector that injects faults into the virtual memory of one of the replica processes to simulate the effects of radiation in space. These faults occasionally cause the process to crash or produce erroneous outputs. For this study we used two different applications: one that encrypts an input file using the RSA algorithm, and another that optimizes the trade-off between time spent and fuel consumption for a low-thrust orbit transfer. The DF is, however, generic enough that any application written in C or Fortran could be used with little or no modification of the original source code. Our results show the potential of this approach for detecting and recovering from radiation-induced random errors. The approach is very cost-efficient compared to hardware-implemented duplex operation and can be adopted to control processes on spacecraft where the fault rate produced by cosmic rays is not very high.
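The compare-and-restart idea described in the abstract can be illustrated with a minimal sketch. This is not the DF itself: the real framework runs the replicas and the comparator as separate processes on different cluster nodes, whereas this sketch runs everything sequentially in one process. The names `flip_random_bit`, `replica`, and `run_duplex`, the checksum workload, and the `faulty_attempts` parameter are all assumptions made for illustration.

```python
import random


def flip_random_bit(memory: bytearray, rng: random.Random) -> None:
    """Flip one randomly chosen bit in place (a simulated radiation upset)."""
    index = rng.randrange(len(memory))
    bit = rng.randrange(8)
    memory[index] ^= 1 << bit


def replica(data: bytes, rng=None) -> int:
    """Hypothetical replica: copy the input into working memory, return a checksum.

    If rng is given, one bit of the working memory is corrupted first,
    standing in for the fault injector acting on one replica.
    """
    memory = bytearray(data)
    if rng is not None:
        flip_random_bit(memory, rng)
    return sum(memory)


def run_duplex(data: bytes, faulty_attempts=(), max_restarts=3, seed=0):
    """Comparator loop: accept the result when both replicas agree,
    otherwise restart both. Returns (result, number_of_restarts).

    faulty_attempts (an assumption of this sketch) lists the attempt
    numbers on which a fault is injected into the first replica.
    """
    rng = random.Random(seed)
    for attempt in range(max_restarts + 1):
        result_a = replica(data, rng if attempt in faulty_attempts else None)
        result_b = replica(data)
        if result_a == result_b:
            return result_a, attempt
    raise RuntimeError("replicas still disagree after max_restarts")
```

For example, `run_duplex(b"orbit transfer", faulty_attempts={0})` injects a fault on the first attempt only, so the comparator sees a mismatch, restarts both replicas once, and then accepts the agreed result. A byte-sum checksum is used here because a single bit flip always changes it, which keeps the sketch deterministic.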
Keywords :
aerospace computing; software fault tolerance; space vehicles; COTS digital electronics; commercial off-the-shelf computer cluster; cosmic rays; duplex framework; fault injection campaign; fault tolerant duplex framework; low-thrust orbit transfer; software based fault tolerance; spacecraft computing system; virtual memory; Aerospace electronics; Computational modeling; Computer applications; Computer crashes; Computer displays; Cryptography; Fault tolerance; Fault tolerant systems; Space vehicles; Vehicle crash testing;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Aerospace Conference, 2007 IEEE
Conference_Location :
Big Sky, MT
ISSN :
1095-323X
Print_ISBN :
1-4244-0524-6
Electronic_ISBN :
1095-323X
Type :
conf
DOI :
10.1109/AERO.2007.352648
Filename :
4161526