Title :
Error and failure analysis of a UNIX server
Author :
Lal, Ronjeet ; Choi, Gwan
Author_Institution :
Dept. of Electr. Eng., Texas A&M Univ., College Station, TX, USA
Abstract :
This paper presents a measurement-based dependability study of a UNIX server. The event logs of a UNIX server are collected to form the dependability data basis. Message logs spanning approximately eleven months were collected for this study. The event log data are classified and categorized to calculate parameters such as MTBF and availability. Component analysis is also performed to identify modules that are prone to errors in the system. Next, the system error activity preceding each system failure is analyzed to identify error patterns that may be precursors of the observed failure events. Lastly, the error/failure results from the measurement are reviewed from the perspective of the fault/error assumptions made in several popular fault injection studies.
Keywords :
Unix; client-server systems; network operating systems; network servers; software reliability; MTBF; UNIX server; availability; component analysis; error analysis; event logs; failure analysis; fault injection; measurement-based dependability; message logs; system failure; Application software; Computer errors; Electric variables measurement; Failure analysis; Fault detection; Fault location; Information security; Internet; Read only memory; Safety;
Conference_Title :
High-Assurance Systems Engineering Symposium, 1998. Proceedings. Third IEEE International
Conference_Location :
Washington, DC
Print_ISBN :
0-8186-9221-9
DOI :
10.1109/HASE.1998.731618
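
The abstract mentions deriving MTBF and availability from classified event-log data. As a minimal sketch of how such parameters are commonly computed, and not the authors' actual procedure, the Python below assumes each failure has already been reduced to a (failure time, recovery time) pair inside a known observation window. The function name `dependability_metrics`, the tuple layout, and all sample values are hypothetical and are not taken from the paper.

```python
from datetime import datetime, timedelta


def dependability_metrics(window_start, window_end, outages):
    """Return (MTBF, availability) for one observation window.

    outages: list of (failure_time, recovery_time) tuples; this layout is
    an assumption made for illustration, not the paper's log format.
    """
    total = window_end - window_start
    # Sum the length of every outage to get total downtime.
    downtime = sum((up - down for down, up in outages), timedelta())
    uptime = total - downtime
    failures = len(outages)
    # MTBF here is total uptime divided by the number of failures.
    mtbf = uptime / failures if failures else None
    # Availability is the fraction of the window the system was up.
    availability = uptime / total
    return mtbf, availability


# Hypothetical data: a 10-day window with two 6-hour outages.
start = datetime(2000, 1, 1)
end = start + timedelta(days=10)
outages = [
    (start + timedelta(days=2), start + timedelta(days=2, hours=6)),
    (start + timedelta(days=7), start + timedelta(days=7, hours=6)),
]
mtbf, availability = dependability_metrics(start, end, outages)
print(mtbf, availability)
```

Other definitions of MTBF exist (for example, measuring from failure to failure rather than over uptime only), so the choice above is one convention among several.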