Title :
Adaptive algorithms for diagnosing large-scale failures in computer networks
Author :
Tati, Srikar ; Ko, Bong Jun ; Cao, Guohong ; Swami, Ananthram ; Porta, Thomas La
Author_Institution :
Network & Security Res. Center, Pennsylvania State Univ., University Park, PA, USA
Abstract :
In this paper, we propose an algorithm to efficiently diagnose large-scale clustered failures. The algorithm, Cluster-MAX-COVERAGE (CMC), is based on greedy approach. We address the challenge of determining faults with incomplete symptoms. CMC makes novel use of both positive and negative symptoms to output a hypothesis list with a low number of false negatives and false positives quickly. CMC requires reports from about half as many nodes as other existing algorithms to determine failures with 100% accuracy. Moreover, CMC accomplishes this gain significantly faster (sometimes by two orders of magnitude) than an algorithm that matches its accuracy. Furthermore, we propose an adaptive algorithm called Adaptive-MAX-COVERAGE (AMC) that performs efficiently during both kinds of failures, i.e., independent and clustered. During a series of failues that include both independent and clustered, AMC results in a reduced number of false negatives and false positives.
Keywords :
computer network reliability; fault diagnosis; greedy algorithms; pattern clustering; AMC algorithm; CMC algorithm; adaptive algorithms; adaptive-MAX-COVERAGE algorithm; cluster-MAX-COVERAGE algorithm; computer networks; greedy approach; large-scale cluster failure diagnosis; negative symptoms; positive symptoms; Accuracy; Adaptive algorithms; Clustering algorithms; Complexity theory; Fault diagnosis; Network topology; Simulation; Clustered failures; Fault diagnosis; Incomplete information; Large-scale failures;
Conference_Titel :
Dependable Systems and Networks (DSN), 2012 42nd Annual IEEE/IFIP International Conference on
Conference_Location :
Boston, MA
Print_ISBN :
978-1-4673-1624-8
Electronic_ISBN :
1530-0889
DOI :
10.1109/DSN.2012.6263917