DocumentCode
1234984
Title
Availability requirement for a fault-management server in high-availability communication systems
Author
Sun, Hairong ; Han, James J. ; Levendel, Haim
Author_Institution
High Reliability & Availability Technol. Center, Motorola, Deer Park, IL, USA
Volume
52
Issue
2
fYear
2003
fDate
6/1/2003 12:00:00 AM
Firstpage
238
Lastpage
244
Abstract
This paper investigates the availability requirement for the fault management server in high-availability communication systems. This study shows that the availability of the fault management server does not need to be 99.999% in order to guarantee a 99.999% system availability, as long as the fail-safe ratio (the probability that a failure of the fault management server does not bring down the system) and the fault coverage ratio (the probability that a failure in the system is detected and recovered by the fault management server) are sufficiently high. Tradeoffs can be made among the availability of the fault management server, the fail-safe ratio, and the fault coverage ratio to optimize system availability. A cost-effective design for the fault management server is proposed.
Keywords
Markov processes; computer network management; network servers; probability; telecommunication network management; telecommunication network reliability; Markov model; availability requirement; fail-safe ratio; fault coverage ratio; fault management server failure; fault-management server; high-availability communication systems; probability; Availability; Cost function; Fault detection; Fault tolerance; Network servers; Power supplies; Power system management; Software performance; Sun; Telecommunication traffic
fLanguage
English
Journal_Title
Reliability, IEEE Transactions on
Publisher
IEEE
ISSN
0018-9529
Type
jour
DOI
10.1109/TR.2003.812624
Filename
1211116