DocumentCode
3549473
Title
A framework for node-level fault tolerance in distributed real-time systems
Author
Aidemark, Joakim ; Folkesson, Peter ; Karlsson, Johan
Author_Institution
Dept. of Safety Electron., Volvo Car Corp., Gothenburg, Sweden
fYear
2005
fDate
28 June-1 July 2005
Firstpage
656
Lastpage
665
Abstract
This paper describes a framework for achieving node-level fault tolerance (NLFT) in distributed real-time systems. The objective of NLFT is to mask errors at the node level in order to reduce the probability of node failures and thereby improve system dependability. We describe an approach called lightweight NLFT, in which transient faults are masked locally in the nodes by time-redundant execution of application tasks. The advantages of lightweight NLFT are demonstrated by a reliability analysis of an example brake-by-wire architecture. The results show that the use of lightweight NLFT may provide 55% higher reliability after one year and almost 60% higher MTTF, compared to using fail-silent nodes.
Keywords
distributed processing; error handling; fault tolerant computing; probability; real-time systems; reliability; brake-by-wire architecture; distributed real-time systems; lightweight NLFT; node failures; node-level fault tolerance; reliability analysis; system dependability; time-redundant execution; transient faults; Computer errors; Consumer electronics; Costs; Distributed computing; Fault tolerance; Fault tolerant systems; Military computing; Real time systems; Road safety; Vehicle safety
fLanguage
English
Publisher
IEEE
Conference_Titel
Dependable Systems and Networks, 2005. DSN 2005. Proceedings. International Conference on
Print_ISBN
0-7695-2282-3
Type
conf
DOI
10.1109/DSN.2005.7
Filename
1467839