DocumentCode :
36283
Title :
Adaptive Algorithms for Diagnosing Large-Scale Failures in Computer Networks
Author :
Tati, Srikar ; Bong Jun Ko ; Guohong Cao ; Swami, Ananthram ; La Porta, Thomas F.
Author_Institution :
Inst. for Networking & Security Res., Pennsylvania State Univ., University Park, PA, USA
Volume :
26
Issue :
3
fYear :
2015
fDate :
March 2015
Firstpage :
646
Lastpage :
656
Abstract :
We propose a greedy algorithm, Cluster-MAX-COVERAGE (CMC), to efficiently diagnose large-scale clustered failures. We primarily address the challenge of determining faults with incomplete symptoms. CMC makes novel use of both positive and negative symptoms to quickly output a hypothesis list with few false negatives and false positives. CMC requires reports from about half as many nodes as existing algorithms to determine failures with 100 percent accuracy. Moreover, CMC achieves this gain significantly faster (sometimes by two orders of magnitude) than an algorithm that matches its accuracy. When a reporting node has fewer positive and negative symptoms, CMC performs much better than existing algorithms. We also propose an adaptive algorithm, Adaptive-MAX-COVERAGE (AMC), that performs efficiently during both independent and clustered failures. During a series of failures that includes both independent and clustered failures, AMC reduces the number of false negatives and false positives.
Keywords :
computer network reliability; fault diagnosis; greedy algorithms; large-scale systems; workstation clusters; AMC; CMC; adaptive algorithms; adaptive-MAX-COVERAGE; cluster-MAX-COVERAGE; false negatives; false positives; greedy algorithm; large-scale clustered failure diagnosis; Accuracy; Adaptive algorithms; Clustering algorithms; Complexity theory; Computer networks; Fault diagnosis; Network topology; Fault diagnosis; clustered failures; incomplete information; large-scale failures;
fLanguage :
English
Journal_Title :
Parallel and Distributed Systems, IEEE Transactions on
Publisher :
IEEE
ISSN :
1045-9219
Type :
jour
DOI :
10.1109/TPDS.2014.2311814
Filename :
6767126