DocumentCode
628209
Title
Fault detection and localization in distributed systems using invariant relationships
Author
Sharma, Abhishek B. ; Haifeng Chen ; Min Ding ; Yoshihira, K. ; Guofei Jiang
Author_Institution
NEC Labs. America, Princeton, NJ, USA
fYear
2013
fDate
24-27 June 2013
Firstpage
1
Lastpage
8
Abstract
Recent advances in sensing and communication technologies enable us to collect round-the-clock monitoring data from a wide array of distributed systems, including data centers, manufacturing plants, transportation networks, and automobiles. This data often takes the form of time series collected from multiple sensors, both hardware and software based. Previously, we developed an approach based on time-invariant relationships that uses Auto-Regressive models with eXogenous input (ARX) to model this data. A tool based on our approach has been effective for fault detection and capacity planning in distributed systems. In this paper, we first describe our experience in applying this tool in real-world settings. We also discuss the challenges in fault localization that we face when using our tool, and present two approaches that we developed to address this problem: a spatial approach based on invariant graphs and a temporal approach based on expected broken invariant patterns.
Keywords
autoregressive processes; distributed processing; fault diagnosis; graph theory; sensor fusion; time series; ARX; auto-regressive model with exogenous input; capacity planning; communication technologies; distributed systems; expected broken invariant patterns; fault detection; fault localization; invariant graphs; sensing technologies; spatial approach; temporal approach; time-invariant relationship based approach; Data models; Monitoring; Noise; Servers; Time measurement; Time series analysis
fLanguage
English
Publisher
IEEE
Conference_Titel
Dependable Systems and Networks (DSN), 2013 43rd Annual IEEE/IFIP International Conference on
Conference_Location
Budapest
ISSN
1530-0889
Print_ISBN
978-1-4673-6471-3
Type
conf
DOI
10.1109/DSN.2013.6575304
Filename
6575304