• DocumentCode
    1515469
  • Title

    Automatic recognition of intermittent failures: an experimental study of field data

  • Author

    Iyer, Ravishankar K. ; Young, Luke T. ; Iyer, P. V Krishna

  • Author_Institution
    Coordinated Sci. Lab., Illinois Univ., Urbana, IL, USA
  • Volume
    39
  • Issue
    4
  • fYear
    1990
  • fDate
    4/1/1990 12:00:00 AM
  • Firstpage
    525
  • Lastpage
    537
  • Abstract
    A methodology is proposed for recognizing the symptoms of persistent problems in large systems. The system error rate is used to identify the error states among which relationships may exist. Statistical techniques are used to validate and quantify the strength of the relationship among these error states. As input, the approach takes the raw error logs containing a single entry for each error that is detected as an isolated event. As output, it produces a list of symptoms that characterize persistent errors. Thus, given a failure, it is determined whether the failure is an intermittent manifestation of a common fault or whether it is an isolated (transient) incident. The technique is shown to work on two CYBER systems and on an IBM 3081 multiprocessor system. Comparisons with real failure/repair information obtained from field engineers show that, in about 85% of the cases, the error symptoms recognized by this approach correspond to real problems. The remaining 15% of the cases, although not directly supported by field data, are confirmed as being valid problems.
  • Keywords
    software reliability; CYBER systems; IBM 3081 multiprocessor system; automatic recognition; error rate; intermittent failures; raw error logs; statistical techniques; Artificial intelligence; Error analysis; Event detection; Manufacturing; Marine vehicles; Military computing; Multiprocessing systems; NASA; Operating systems; Statistical analysis
  • fLanguage
    English
  • Journal_Title
    Computers, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0018-9340
  • Type

    jour

  • DOI
    10.1109/12.54845
  • Filename
    54845