• DocumentCode
    1801278
  • Title
    Assessment and Improvement of Hang Detection in the Linux Operating System
  • Author
    Cotroneo, Domenico ; Natella, Roberto ; Russo, Stefano
  • Author_Institution
    Dipt. di Inf. e Sist., Univ. degli Studi di Napoli Federico II, Naples, Italy
  • fYear
    2009
  • fDate
    27-30 Sept. 2009
  • Firstpage
    288
  • Lastpage
    294
  • Abstract
    We propose a fault injection framework to assess the hang detection facilities of the Linux operating system (OS). The framework is novel in two respects: it adopts a more representative fault load than existing ones, and it is effective in terms of the number of hang failures produced; representativeness is supported by a field data study on the Linux OS. Using the proposed framework with realistic workloads, we find that the Linux OS fails to detect hangs in several cases, achieving a relative coverage of 75%. To improve detection, we propose a simple yet effective hang detector that periodically tests OS liveness, as perceived by applications, by means of I/O system calls; we show that this approach can raise relative coverage to as much as 94%. The hang detector can be deployed on any Linux system with acceptable overhead.
  • Keywords
    Linux; hang detection; operating system; Application software; Detectors; Face detection; Fault detection; Fault tolerance; Hardware; Operating systems; Software testing; System testing; Autonomic Systems; Fault Injection; Hang Detection; Linux OS;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Reliable Distributed Systems, 2009. SRDS '09. 28th IEEE International Symposium on
  • Conference_Location
    Niagara Falls, NY
  • ISSN
    1060-9857
  • Print_ISBN
    978-0-7695-3826-6
  • Type
    conf
  • DOI
    10.1109/SRDS.2009.26
  • Filename
    5283184