Title :
Failure data analysis of a LAN of Windows NT based computers
Author :
Kalyanakrishnam, M. ; Kalbarczyk, Z. ; Iyer, R.
Author_Institution :
Center for Reliable and High-Performance Computing, University of Illinois, Urbana, IL, USA
Abstract :
This paper presents results of a failure data analysis of a LAN of Windows NT machines. Data for the study was obtained from event logs collected over a six-month period from the mail routing network of a commercial organization. The study focuses on characterizing causes of machine reboots. The key observations from this study are: 1) most of the problems that lead to reboots are software related; 2) rebooting the machine does not always solve the problem; 3) there are indications of propagated or correlated failures; and 4) though the average availability evaluates to over 99%, the machine downtime lasts (on average) two hours. Since the machines are dedicated mail servers, bringing down one or more of them can potentially disrupt storage, forwarding, reception, and delivery of mail. This suggests that the average availability is not a good measure to characterize this type of network service.
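The tension in observation 4 can be made concrete with the standard steady-state relation, availability = mean uptime / (mean uptime + mean downtime). The sketch below is illustrative only and is not taken from the paper: the function names are hypothetical, the 99% figure is used as a round lower bound on the reported availability, and the six-month period is approximated as 4380 hours.

```python
def mean_uptime_hours(availability: float, mean_downtime_hours: float) -> float:
    """Solve A = U / (U + D) for the mean uptime U between outages."""
    if not 0.0 < availability < 1.0:
        raise ValueError("availability must be strictly between 0 and 1")
    return availability * mean_downtime_hours / (1.0 - availability)


def outages_in_period(availability: float, mean_downtime_hours: float,
                      period_hours: float) -> float:
    """Expected number of uptime/downtime cycles that fit in the period."""
    cycle = mean_uptime_hours(availability, mean_downtime_hours) + mean_downtime_hours
    return period_hours / cycle


# Assumed inputs: 99% availability, two-hour mean downtime,
# and roughly six months (182.5 days * 24 h = 4380 h).
uptime = mean_uptime_hours(0.99, 2.0)
outages = outages_in_period(0.99, 2.0, 4380.0)
print(f"mean uptime between outages: {uptime:.1f} h")
print(f"outages per machine in six months: {outages:.1f}")
```

Under these assumed numbers, a machine could look highly available on average while still being down for about two hours at a time roughly once every eight days, which is the kind of disruption the availability figure alone hides. Availability above 99% would lengthen the interval between outages accordingly.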
Keywords :
computer network reliability; failure analysis; local area networks; LAN; Windows NT machines; availability; failure data analysis; machine downtime; network service; rebooting; Availability; Computer networks; Data analysis; Electronic mail; Failure analysis; Local area networks; Network servers; Operating systems; Postal services; Routing
Conference_Title :
Reliable Distributed Systems, 1999. Proceedings of the 18th IEEE Symposium on
Conference_Location :
Lausanne
Print_ISBN :
0-7695-0290-3
DOI :
10.1109/RELDIS.1999.805094