• DocumentCode
    3549473
  • Title

    A framework for node-level fault tolerance in distributed real-time systems

  • Author

    Aidemark, Joakim ; Folkesson, Peter ; Karlsson, Johan

  • Author_Institution
    Dept. of Safety Electron., Volvo Car Corp., Gothenburg, Sweden
  • fYear
    2005
  • fDate
    28 June-1 July 2005
  • Firstpage
    656
  • Lastpage
    665
  • Abstract
    This paper describes a framework for achieving node-level fault tolerance (NLFT) in distributed real-time systems. The objective of NLFT is to mask errors at the node level in order to reduce the probability of node failures and thereby improve system dependability. We describe an approach called lightweight NLFT, in which transient faults are masked locally in the nodes by time-redundant execution of application tasks. The advantages of lightweight NLFT are demonstrated by a reliability analysis of an example brake-by-wire architecture. The results show that the use of lightweight NLFT may provide 55% higher reliability after one year and almost 60% higher MTTF, compared to using fail-silent nodes.
  • Keywords
    distributed processing; error handling; fault tolerant computing; probability; real-time systems; reliability; brake-by-wire architecture; distributed real-time systems; lightweight NLFT; node failures; node-level fault tolerance; reliability analysis; system dependability; time-redundant execution; transient faults; Computer errors; Consumer electronics; Costs; Distributed computing; Fault tolerance; Fault tolerant systems; Military computing; Real time systems; Road safety; Vehicle safety
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Dependable Systems and Networks, 2005. DSN 2005. Proceedings. International Conference on
  • Print_ISBN
    0-7695-2282-3
  • Type

    conf

  • DOI
    10.1109/DSN.2005.7
  • Filename
    1467839