DocumentCode :
1619829
Title :
JAGR: an autonomous self-recovering application server
Author :
Candea, George ; Kiciman, Emre ; Zhang, Steve ; Keyani, Pedram ; Fox, Armando
Author_Institution :
Comput. Syst. Lab., Stanford Univ., CA, USA
fYear :
2003
fDate :
6/25/2003
Firstpage :
168
Lastpage :
177
Abstract :
This paper demonstrates that the dependability of generic, evolving J2EE applications can be enhanced through a combination of a few recovery-oriented techniques. Our goal is to reduce downtime by automatically and efficiently recovering from a broad class of transient software failures without having to modify applications. We describe here the integration of three new techniques into JBoss, an open-source J2EE application server. The resulting system is JAGR (JBoss with Application-Generic Recovery), a self-recovering execution platform. JAGR combines application-generic failure-path inference (AFPI), path-based failure detection, and micro-reboots. AFPI uses controlled fault injection and observation to infer paths that faults follow through a J2EE application. Path-based failure detection uses tagging of client requests and statistical analysis to identify anomalous component behavior. Micro-reboots are fast reboots we perform at the sub-application level to recover components from transient failures; by selectively rebooting only those components that are necessary to repair the failure, we reduce recovery time. These techniques are designed to be autonomous and application-generic, making them well suited to the rapidly changing software of Internet services.
Keywords :
Java; fault tolerant computing; inference mechanisms; middleware; network servers; open systems; system recovery; AFPI; Internet service; JAGR-JBoss; anomalous component behavior identification; application-generic failure-path inference; application-generic recovery; autonomous self-recovering application server; client request tagging; component recovery; controlled fault injection; downtime reduction; evolving J2EE application; fault path inference; microreboot; open-source J2EE application server; path-based failure detection; recovery time reduction; recovery-oriented technique; selective rebooting; self-recovering execution platform; statistical analysis; transient failure; transient software failure; Application software; Conferences; Hardware; Java; Middleware; Open source software; Runtime; Statistical analysis; Tagging; Web and internet services;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Proceedings of the Autonomic Computing Workshop, 2003
Print_ISBN :
0-7695-1983-0
Type :
conf
DOI :
10.1109/ACW.2003.1210217
Filename :
1210217