DocumentCode
2297110
Title
Eliminating Single Points of Failure in Software-Based Redundancy
Author
Ulbrich, Peter ; Hoffmann, Martin ; Kapitza, Rüdiger ; Lohmann, Daniel ; Schröder-Preikschat, Wolfgang ; Schmid, Reiner
Author_Institution
Wolfgang Schröder-Preikschat, Friedrich-Alexander Univ. Erlangen-Nuremberg, Erlangen, Germany
fYear
2012
fDate
8-11 May 2012
Firstpage
49
Lastpage
60
Abstract
In the domain of safety-critical embedded and cyber-physical systems, software-based redundancy is generally understood as an effective and cheap approach to improve reliability. Redundant execution in the form of triple modular redundancy (TMR) is an especially well-known solution. However, TMR leaves single points of failure (SPOFs) unprotected, such as the voter, which must be carefully accounted for in all safety considerations. We present Combined Redundancy (CoRed), a holistic approach that hardens safety-critical parts of a system against soft errors while effectively eliminating the vulnerability caused by SPOFs. CoRed combines redundant execution with encoded processing to protect the otherwise unprotected voting and data distribution. Its implementation does not require specific knowledge about the application and can be easily integrated into existing projects. We evaluated CoRed in a realistic setting using a quadrotor helicopter and provide experimental evidence for soft-error resistance and comparably low resource demand. In our experimental comparison, plain TMR left more than seven percent of failures undetected, whereas CoRed eliminated all silent data corruptions while inducing an overhead of just seven percent.
Keywords
embedded systems; error detection; failure analysis; redundancy; safety-critical software; software fault tolerance; CoRed; SPOF; TMR; combined redundancy; cyber-physical system; data distribution; embedded system; quadrotor helicopter; reliability; safety-critical system; single points of failure; soft error resistance; software-based redundancy; triple modular redundancy; unprotected voting; Actuators; Encoding; Equations; Hardware; Redundancy; Tunneling magnetoresistance; Domain-specific architectures; Fault-tolerance; Frameworks; Reliability; Soft errors; Software and System Safety;
fLanguage
English
Publisher
IEEE
Conference_Titel
Dependable Computing Conference (EDCC), 2012 Ninth European
Conference_Location
Sibiu
Print_ISBN
978-1-4673-0938-7
Type
conf
DOI
10.1109/EDCC.2012.21
Filename
6214760