• DocumentCode
    3349600
  • Title
    Reducing critical failures for control algorithms using executable assertions and best effort recovery
  • Author
    Vinter, Jonny ; Aidemark, Joakim ; Folkesson, Peter ; Karlsson, Johan

  • Author_Institution
    Dept. of Comput. Eng., Chalmers Univ. of Technol., Goteborg, Sweden
  • fYear
    2001
  • fDate
    1-4 July 2001
  • Firstpage
    347
  • Lastpage
    356
  • Abstract
    Systems that use f+1 computer nodes to tolerate f node failures ordinarily require that the computer nodes have strong failure semantics, i.e. a node should either produce correct results or no results at all. We show that this requirement can be relaxed for control applications, as control algorithms inherently compensate for a class of value failures. Value failures occur when an error escapes the error detection mechanisms in the computer node and an erroneous value is sent to the actuators of the control system. Fault injection experiments show that 89% of the value failures caused by bit flips in a CPU had no or minor impact on the controlled object. However, the experiments also show that 11% of the value failures had severe consequences. These failures were caused by bit flips affecting the state variables of the control algorithm. Another set of fault injection experiments showed that the percentage of value failures with severe consequences was reduced to 3% when the state variables were protected with executable assertions and best-effort recovery mechanisms.
  • Keywords
    actuators; computerised control; error detection; redundancy; software fault tolerance; system recovery; CPU bit flips; actuators; best-effort recovery mechanisms; computer node failure tolerance; control algorithms; controlled object; critical failure reduction; erroneous value; error detection mechanisms; executable assertions; failure semantics; fault injection; severe consequences; state variables; value failure compensation; Actuators; Application software; Computer errors; Control systems; Electric variables control; Error correction; Hardware; Memory management; Redundancy; Satellites;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Dependable Systems and Networks, 2001. DSN 2001. International Conference on
  • Conference_Location
    Goteborg, Sweden
  • Print_ISBN
    0-7695-1101-5
  • Type
    conf
  • DOI
    10.1109/DSN.2001.941419
  • Filename
    941419
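
The abstract's central idea, protecting the state variables of a control algorithm with executable assertions and best-effort recovery, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the PI controller, the class name `GuardedPI`, the range check used as the assertion, and the rollback to the last accepted state used as the recovery. The paper's actual assertions and recovery mechanisms may differ.

```python
from dataclasses import dataclass


@dataclass
class GuardedPI:
    """Hypothetical PI controller whose integrator state is guarded."""
    kp: float
    ki: float
    dt: float
    x_min: float            # assumed plausible lower bound for the state
    x_max: float            # assumed plausible upper bound for the state
    x: float = 0.0          # integrator state (the protected state variable)
    x_backup: float = 0.0   # last state value that passed the assertion
    recoveries: int = 0     # number of times recovery was triggered

    def step(self, reference: float, measurement: float) -> float:
        # Executable assertion: the state must lie in its plausible range.
        # A NaN also fails this comparison and is therefore caught.
        if not (self.x_min <= self.x <= self.x_max):
            # Best-effort recovery: fall back to the last accepted state.
            # The result may not be exact, but it keeps the output bounded.
            self.x = self.x_backup
            self.recoveries += 1

        error = reference - measurement
        self.x += self.ki * error * self.dt
        # Keep the updated state inside the asserted range before saving it.
        self.x = min(max(self.x, self.x_min), self.x_max)
        self.x_backup = self.x
        return self.kp * error + self.x


ctrl = GuardedPI(kp=1.0, ki=0.5, dt=0.1, x_min=-10.0, x_max=10.0)
u1 = ctrl.step(1.0, 0.0)
ctrl.x = 1e30               # simulate a bit flip corrupting the state
u2 = ctrl.step(1.0, 0.0)    # assertion fails, state is rolled back
```

In this sketch the corrupted state never reaches the actuator output: the second call detects the out-of-range value, restores the backup, and continues from there. The backup copy is itself unprotected here, which a real design would need to address.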