• DocumentCode
    1855809
  • Title
    Achieving reliability growth on real-time systems
  • Author
    Lane, Christopher A.; Morrison, Joseph D.
  • Author_Institution
    IBM Corp., Rockville, MD, USA
  • fYear
    1994
  • fDate
    24-27 Jan 1994
  • Firstpage
    136
  • Lastpage
    141
  • Abstract
    This paper addresses the principles used to predict and attain reliability growth on real-time systems. System reliability modeling techniques that include software reliability, maintenance effectiveness, and failure recovery are discussed in detail. Several software reliability growth models are discussed with emphasis on measured reliability growth of fielded software. The impact of maintenance effectiveness, which is a measure of the maintainer's skill and training levels, is shown. The need to develop and measure the robustness of failure recovery algorithms is emphasized in this paper. All of these factors are combined with the failure and repair characteristics of hardware to create comprehensive reliability growth models for real-time systems. Through the authors' research, they have determined that effective failure recovery algorithms are the key to attaining highly reliable systems. Without them, redundant computer systems that run banking and air traffic control systems will come crashing down with possibly disastrous results. The modeling and measurement techniques discussed in this paper provide the reliability practitioner with the methods to predict and achieve reliability growth resulting from improved software reliability and recovery algorithms. A fault-tolerant system's ability to recover from hardware and software failures is gauged by a parameter called coverage. Coverage is the conditional probability of recovery given that a failure has occurred. Because of its huge impact on system reliability, the measurement of coverage is emphasized.
  • Keywords
    Markov processes; fault tolerant computing; real-time systems; reliability theory; software maintenance; software reliability; system recovery; conditional probability of recovery; coverage; failure recovery; fault tolerant system; maintenance effectiveness; real-time systems; reliability growth; reliability modeling techniques; robustness; software reliability; Air traffic control; Banking; Computer crashes; Hardware; Predictive models; Real time systems; Robustness; Software maintenance; Software measurement; Software reliability;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Reliability and Maintainability Symposium, 1994. Proceedings., Annual
  • Conference_Location
    Anaheim, CA
  • Print_ISBN
    0-7803-1786-6
  • Type
    conf
  • DOI
    10.1109/RAMS.1994.291096
  • Filename
    291096
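
The abstract defines coverage as the conditional probability of recovery given that a failure has occurred, and stresses its large effect on system reliability. The sketch below illustrates that effect with a generic textbook Markov model of a repairable two-unit redundant system. It is not the authors' model: the function name `duplex_mttf`, the parameters `lam` (per-unit failure rate), `mu` (repair rate), and `c` (coverage), and the assumption of exponentially distributed failure and repair times are all introduced here for illustration only.

```python
def duplex_mttf(lam: float, mu: float, c: float) -> float:
    """Mean time to system failure for a two-unit redundant system.

    Assumed model (hypothetical, not taken from the paper):
      - both units active, each failing at constant rate lam
      - a first failure is recovered with probability c (coverage);
        an uncovered failure brings the whole system down
      - a failed unit is repaired at constant rate mu
      - with one unit left, a further failure (rate lam) is fatal

    Solving the two mean-time equations
        T2 = 1/(2*lam) + c * T1
        T1 = 1/(lam + mu) + (mu / (lam + mu)) * T2
    for T2 (the time starting with both units up) gives the closed form below.
    """
    if lam <= 0 or mu < 0 or not 0.0 <= c <= 1.0:
        raise ValueError("need lam > 0, mu >= 0, 0 <= c <= 1")
    return (lam + mu + 2.0 * lam * c) / (2.0 * lam * (lam + mu * (1.0 - c)))


if __name__ == "__main__":
    # Illustrative rates only: one failure per 1000 h, one repair per 2 h.
    lam, mu = 1e-3, 0.5
    for c in (0.90, 0.99, 0.999, 1.0):
        print(f"coverage={c:<6} MTTF={duplex_mttf(lam, mu, c):.3e} h")
```

Under these assumptions, the mean time to failure is dominated by the uncovered-failure term `mu * (1 - c)` whenever repair is much faster than failure, which is one way to see why even small shortfalls in coverage can outweigh improvements in hardware failure rate or repair speed.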