DocumentCode
1804245
Title
Building a Self-Healing Operating System
Author
David, Francis M. ; Campbell, Roy H.
Author_Institution
Univ. of Illinois at Urbana-Champaign, Urbana
fYear
2007
fDate
25-26 Sept. 2007
Firstpage
3
Lastpage
10
Abstract
User applications and data in volatile memory are usually lost when an operating system crashes because of errors caused by either hardware or software faults. This is because most operating systems are designed to stop working when some internal errors are detected despite the possibility that user data and applications might still be intact and recoverable. Techniques like exception handling, code reloading, operating system component isolation, micro-rebooting, automatic system service restarts, watchdog timer based recovery and transactional components can be applied to attempt self-healing of an operating system from a wide variety of errors. Fault injection experiments show that these techniques can be used to continue running user applications after transparently recovering the operating system in a large percentage of cases. In cases where transparent recovery is not possible, individual process recovery can be attempted as a last resort.
Keywords
operating systems (computers); system recovery; automatic system service restart; code reloading; exception handling; hardware fault; micro-rebooting; self-healing operating system; software fault; transactional component; user application; volatile memory; watchdog timer based recovery; Application software; Computer bugs; Computer crashes; Computer errors; Computer science; Error correction; Error correction codes; Hardware; Operating systems; Signal processing;
fLanguage
English
Publisher
IEEE
Conference_Titel
Dependable, Autonomic and Secure Computing, 2007. DASC 2007. Third IEEE International Symposium on
Conference_Location
Columbia, MD
Print_ISBN
978-0-7695-2985-1
Type
conf
DOI
10.1109/DASC.2007.22
Filename
4351383