DocumentCode
2349464
Title
A methodology for detection and estimation of software aging
Author
Garg, Sachin ; Van Moorsel, Aad ; Vaidyanathan, Kalyanaraman ; Trivedi, Kishor S.
Author_Institution
AT&T Bell Labs., Murray Hill, NJ, USA
fYear
1998
fDate
4-7 Nov 1998
Firstpage
283
Lastpage
292
Abstract
The phenomenon of software aging refers to the accumulation of errors during the execution of software, which eventually results in its crash/hang failure. A gradual performance degradation may also accompany software aging. Proactive fault management techniques such as “software rejuvenation” (Y. Huang et al., 1995) may be used to counteract aging if it exists. We propose a methodology for detection and estimation of aging in the UNIX operating system. First, we present the design and implementation of an SNMP-based, distributed monitoring tool used to collect operating system resource usage and system activity data at regular intervals from networked UNIX workstations. Statistical trend detection techniques are applied to this data to detect and validate the existence of aging. To quantify the effect of aging on operating system resources, we propose a metric, “estimated time to exhaustion”, which is calculated using well-known slope estimation techniques. Although the distributed data collection tool is specific to UNIX, the statistical techniques can be used for detection and estimation of aging in other software as well.
Keywords
Unix; software fault tolerance; software maintenance; system monitoring; SNMP based distributed monitoring tool; UNIX operating system; distributed data collection tool; error accumulation; estimated time to exhaustion; networked UNIX workstations; operating system resource usage; performance degradation; proactive fault management techniques; slope estimation techniques; software aging detection; software aging estimation; software rejuvenation; statistical techniques; statistical trend detection techniques; system activity data; Aging; Application software; Computer errors; Degradation; Electrical capacitance tomography; Hardware; Monitoring; Operating systems; Read only memory; Software safety;
fLanguage
English
Publisher
IEEE
Conference_Titel
Software Reliability Engineering, 1998. Proceedings. The Ninth International Symposium on
Conference_Location
Paderborn
ISSN
1071-9458
Print_ISBN
0-8186-8991-9
Type
conf
DOI
10.1109/ISSRE.1998.730892
Filename
730892