DocumentCode
246304
Title
Sequential Fault Monitoring
Author
Feng, Dawei ; Germain, Cecile ; Nauroy, Julien
Author_Institution
LRI, Univ. Paris Sud, Paris, France
fYear
2014
fDate
8-12 Sept. 2014
Firstpage
25
Lastpage
34
Abstract
For large-scale distributed systems, the knowledge component at the core of the MAPE-K loop remains elusive. In the context of end-to-end probing, fault monitoring can be recast as an inference problem in the space-time domain. We propose and evaluate Sequential Matrix Factorization (SMF), a fully spatio-temporal method that exploits both recent advances in matrix factorization for the spatial information and a new heuristic based on historical information. Adaptivity operates at two levels: algorithmically, as the exploration/exploitation trade-off is controlled by a self-calibrating parameter, and at the policy level, as active learning is required for the most challenging cases of a real-world dataset.
Keywords
distributed processing; fault diagnosis; fault tolerant computing; learning (artificial intelligence); matrix decomposition; SMF; active learning; exploration-exploitation tradeoff; fully spatio-temporal method; historical information; self-calibrating parameter; sequential fault monitoring; sequential matrix factorization; spatial information; Context; Estimation; Linear programming; Monitoring; Prediction algorithms; Probes; Yttrium; Fault prediction; matrix factorization
fLanguage
English
Publisher
IEEE
Conference_Titel
Cloud and Autonomic Computing (ICCAC), 2014 International Conference on
Conference_Location
London
Type
conf
DOI
10.1109/ICCAC.2014.17
Filename
7024041