DocumentCode :
3061840
Title :
Request Path Driven Model for Performance Fault Diagnoses
Author :
Tian, Guanhua ; Meng, Dan ; Li, Yong
Author_Institution :
Inst. of Comput. Technol., Chinese Acad. of Sci., Beijing, China
fYear :
2010
fDate :
6-9 Sept. 2010
Firstpage :
298
Lastpage :
305
Abstract :
Locating and diagnosing performance faults in distributed systems is crucial but challenging. Distributed systems are increasingly complex, full of correlations and dependencies, and highly dynamic. These properties make traditional approaches prone to high false-alarm rates. In this paper, we propose a novel system modeling technique that encodes components' dynamic dependencies and behavior characteristics into a system meta-model and uses that meta-model as a unifying framework for deploying component sub-models. We propose an automatic analysis approach that distills request path signatures, the essential information about components' dynamic behaviors, from request travel paths, uses them to induce the meta-model as a Bayesian network, and then uses the model for fault location and diagnosis. We conduct fault-injection experiments with RUBiS, a TPC-W-like benchmark that simulates eBay.com. The results indicate that our modeling approach provides effective problem diagnosis, i.e., the Bayesian network technique is effective for detecting and pinpointing faults in the context of request tracing. Moreover, the meta-model induced from request paths provides effective guidance for learning statistical correlations among metrics across the system, which effectively avoids 'false alarms' in fault pinpointing. As a case study, we construct a proactive recovery framework that integrates our system modeling technique with software rejuvenation to guarantee the system's quality of service.
Keywords :
belief networks; distributed processing; software fault tolerance; Bayesian network technique; RUBiS benchmark; distributed systems; performance fault diagnosis; proactive recovery framework; request path driven model; Bayesian methods; Correlation; Fault detection; Fault location; Measurement; Modeling; Probability distribution;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Parallel and Distributed Processing with Applications (ISPA), 2010 International Symposium on
Conference_Location :
Taipei
Print_ISBN :
978-1-4244-8095-1
Electronic_ISBN :
978-0-7695-4190-7
Type :
conf
DOI :
10.1109/ISPA.2010.70
Filename :
5634347