• DocumentCode
    2353722
  • Title
    Improving Log-based Field Failure Data Analysis of multi-node computing systems
  • Author
    Pecchia, Antonio ; Cotroneo, Domenico ; Kalbarczyk, Zbigniew ; Iyer, Ravishankar K.
  • Author_Institution
    Dipt. di Inf. e Sist., Univ. degli Studi di Napoli Federico II, Naples, Italy
  • fYear
    2011
  • fDate
    27-30 June 2011
  • Firstpage
    97
  • Lastpage
    108
  • Abstract
    Log-based Field Failure Data Analysis (FFDA) is a widely adopted methodology for assessing the dependability properties of an operational system. A key step in FFDA is filtering the log to remove both entries that are not useful and redundant error entries. Removing redundant entries is challenging: a fault, once triggered, can generate multiple errors that propagate within the system. Grouping the error entries related to the same fault manifestation is crucial to obtain realistic measurements. This paper addresses the shortcomings, in multi-node computing systems, of the tuple heuristic used to group error entries in the log. We demonstrate that the tuple heuristic can group entries incorrectly; thus, an improved heuristic that adopts statistical indicators is proposed. We assess the impact of inaccurate grouping on dependability measurements by comparing the results obtained with the two heuristics. The analysis encompasses the log of the Mercury cluster at the National Center for Supercomputing Applications.
  • Keywords
    data analysis; statistical analysis; National Center for Supercomputing Applications; log-based field failure data; multinode computing system; statistical indicator; tuple heuristic issue; Context; Data analysis; Failure analysis; Joints; Nickel; Operating systems; Sensitivity analysis; Field Failure Data Analysis; collision; dependability measurements; supercomputer; tuple heuristic
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Dependable Systems & Networks (DSN), 2011 IEEE/IFIP 41st International Conference on
  • Conference_Location
    Hong Kong
  • ISSN
    1530-0889
  • Print_ISBN
    978-1-4244-9232-9
  • Electronic_ISBN
    1530-0889
  • Type
    conf
  • DOI
    10.1109/DSN.2011.5958210
  • Filename
    5958210