DocumentCode :
709260
Title :
C´Mon: a predictable monitoring infrastructure for system-level latent fault detection and recovery
Author :
Song, Jiguo ; Parmer, Gabriel
Author_Institution :
George Washington Univ., Washington, DC, USA
fYear :
2015
fDate :
13-16 April 2015
Firstpage :
247
Lastpage :
258
Abstract :
Embedded and real-time systems must balance many, often conflicting, goals including predictability, high utilization, efficiency, reliability, and SWaP (size, weight, and power). Reliability is particularly difficult to achieve without significantly impacting the other factors. Though reliability solutions exist at the application level, they are invalidated by system-level faults, which are particularly difficult to detect and recover from. This paper presents the C´Mon system for predictably and efficiently monitoring system-level execution and validating that it conforms to the high-level analytical models that underlie the timing guarantees of the system. Latent faults such as timing errors, incorrect scheduler decisions, unbounded priority inversions, or deadlocks are detected, the faulty component is identified, and, using previous work in system recovery, the system is brought back to a stable state, all without missing deadlines.
Keywords :
embedded systems; fault diagnosis; program diagnostics; system recovery; C´Mon system; deadlocks; embedded system; incorrect scheduler decisions; predictable monitoring infrastructure; real-time systems; system timing guarantees; system-level execution monitoring; system-level latent fault detection; system-level latent fault recovery; timing errors; unbounded priority inversions; Computational modeling; Fault tolerant systems; Instruction sets; Monitoring; Real-time systems; Synchronization;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Real-Time and Embedded Technology and Applications Symposium (RTAS), 2015 IEEE
Conference_Location :
Seattle, WA
Type :
conf
DOI :
10.1109/RTAS.2015.7108448
Filename :
7108448