• DocumentCode
    2349464
  • Title

    A methodology for detection and estimation of software aging

  • Author

    Garg, Sachin ; Van Moorsel, Aad ; Vaidyanathan, Kalyanaraman ; Trivedi, Kishor S.

  • Author_Institution
    AT&T Bell Labs., Murray Hill, NJ, USA
  • fYear
    1998
  • fDate
    4-7 Nov 1998
  • Firstpage
    283
  • Lastpage
    292
  • Abstract
    The phenomenon of software aging refers to the accumulation of errors during the execution of the software, which eventually results in its crash/hang failure. A gradual performance degradation may also accompany software aging. Proactive fault management techniques such as “software rejuvenation” (Y. Huang et al., 1995) may be used to counteract aging if it exists. We propose a methodology for detection and estimation of aging in the UNIX operating system. First, we present the design and implementation of an SNMP-based, distributed monitoring tool used to collect operating system resource usage and system activity data at regular intervals from networked UNIX workstations. Statistical trend detection techniques are applied to this data to detect/validate the existence of aging. For quantifying the effect of aging in operating system resources, we propose a metric, “estimated time to exhaustion”, which is calculated using well-known slope estimation techniques. Although the distributed data collection tool is specific to UNIX, the statistical techniques can be used for detection and estimation of aging in other software as well.
  • Keywords
    Unix; software fault tolerance; software maintenance; system monitoring; SNMP based distributed monitoring tool; UNIX operating system; distributed data collection tool; error accumulation; estimated time to exhaustion; networked UNIX workstations; operating system resource usage; performance degradation; proactive fault management techniques; slope estimation techniques; software aging detection; software aging estimation; software rejuvenation; statistical techniques; statistical trend detection techniques; system activity data; Aging; Application software; Computer errors; Degradation; Electrical capacitance tomography; Hardware; Monitoring; Operating systems; Read only memory; Software safety
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Software Reliability Engineering, 1998. Proceedings. The Ninth International Symposium on
  • Conference_Location
    Paderborn
  • ISSN
    1071-9458
  • Print_ISBN
    0-8186-8991-9
  • Type

    conf

  • DOI
    10.1109/ISSRE.1998.730892
  • Filename
    730892
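
The abstract describes two statistical steps: detecting a trend in sampled resource usage, and turning an estimated slope into an "estimated time to exhaustion". The record does not say which tests or estimators the paper uses, so the following is only a minimal sketch under stated assumptions: the Mann-Kendall test stands in for "trend detection", Sen's median-of-pairwise-slopes estimator stands in for "slope estimation", and time to exhaustion is taken as the last observed free amount divided by the rate of decline. All function names are hypothetical.

```python
import math
from itertools import combinations
from statistics import median


def mann_kendall_z(values):
    """Z statistic of the Mann-Kendall trend test (normal approximation).

    Assumption: no correction for tied values or seasonality is applied.
    """
    n = len(values)
    # S counts later-minus-earlier sign over all ordered pairs.
    s = sum((b > a) - (b < a) for a, b in combinations(values, 2))
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        return (s - 1) / math.sqrt(var_s)
    if s < 0:
        return (s + 1) / math.sqrt(var_s)
    return 0.0


def sen_slope(times, values):
    """Median of all pairwise slopes (Sen's estimator)."""
    points = list(zip(times, values))
    slopes = [
        (v2 - v1) / (t2 - t1)
        for (t1, v1), (t2, v2) in combinations(points, 2)
        if t2 != t1
    ]
    return median(slopes)


def estimated_time_to_exhaustion(times, free_amounts):
    """Time until the free resource reaches zero at the estimated slope.

    Assumption: exhaustion is extrapolated linearly from the last sample.
    Returns math.inf when the resource is not shrinking.
    """
    slope = sen_slope(times, free_amounts)
    if slope >= 0:
        return math.inf
    return free_amounts[-1] / -slope


# Hypothetical samples: free memory (arbitrary units) at hourly intervals.
hours = [0, 1, 2, 3, 4]
free_memory = [100, 90, 80, 70, 60]

z = mann_kendall_z(free_memory)
ette = estimated_time_to_exhaustion(hours, free_memory)
print(f"Mann-Kendall Z = {z:.2f}, estimated time to exhaustion = {ette} hours")
```

A strongly negative Z (for example, below -1.96 at the 5% level) would suggest a downward trend worth quantifying; the slope-based estimate is only meaningful once such a trend has been detected.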