• DocumentCode
    2549219
  • Title

    SHelp: Automatic Self-Healing for Multiple Application Instances in a Virtual Machine Environment

  • Author

    Chen, Gang ; Jin, Hai ; Zou, Deqing ; Zhou, Bing Bing ; Qiang, Weizhong ; Hu, Gang

  • Author_Institution
    Cluster & Grid Comput. Lab., Huazhong Univ. of Sci. & Technol., Wuhan, China
  • fYear
    2010
  • fDate
    20-24 Sept. 2010
  • Firstpage
    97
  • Lastpage
    106
  • Abstract
    When multiple instances of an application run on multiple virtual machines, an interesting problem is how to use the fault-handling result from one application instance to heal the same fault when it occurs on sibling instances, and hence to ensure high service availability in a cloud computing environment. This paper presents SHelp, a lightweight runtime system that can survive software failures in a virtual machine framework. It applies weighted rescue points and error virtualization techniques to make applications bypass the faulty path. The rescue point database adopts a two-level storage hierarchy so that applications running on different virtual machines can share error-handling information, which reduces redundancy and allows faster, more effective recovery from future faults caused by the same bugs. A Linux prototype is implemented and evaluated using four web server applications that contain various types of bugs. Experimental results show that SHelp enables server applications to recover from these bugs in just a few seconds with modest performance overhead.
  • Keywords
    Internet; Linux; file servers; program debugging; software fault tolerance; virtual machines; Linux prototype; SHelp; Web server application; application instances; automatic self-healing; cloud computing environment; error virtualization techniques; fault handling; lightweight runtime system; rescue point database; software failures; two-level storage hierarchy; virtual machine environment; weighted rescue points; Buffer overflow; Computer architecture; Computer bugs; Databases; Servers; Software; Virtual machining; Dynamic Instrumentation; Fault Recovery; Software Reliability; Software Self-healing; Virtual Machine;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Cluster Computing (CLUSTER), 2010 IEEE International Conference on
  • Conference_Location
    Heraklion, Crete
  • Print_ISBN
    978-1-4244-8373-0
  • Electronic_ISBN
    978-0-7695-4220-1
  • Type

    conf

  • DOI
    10.1109/CLUSTER.2010.18
  • Filename
    5600316