• DocumentCode
    2237013
  • Title

    An Optimized Policy for Automatic Failure Recovery in Microrebootable Distributed Systems

  • Author

    Lu Xu ; Wang Hui-qiang ; Zhao Guo-sheng

  • Author_Institution
    Coll. of Comput. Sci. & Technol., Harbin Eng. Univ., Harbin, China
  • fYear
    2009
  • fDate
    26-28 Dec. 2009
  • Firstpage
    1662
  • Lastpage
    1665
  • Abstract
    To overcome the challenges of generating recovery policies in the presence of inaccurate failure detection, this paper presents a failure recovery model for microrebootable distributed systems based on discounted Partially Observable Markov Decision Processes (POMDPs). Reasonable recovery policies are then generated by solving the POMDP model. To tackle the computational complexity of exact solution, a value-function approximation method called the fast informed bound is used to obtain near-optimal policies. Simulation-based experimental results on a realistic network security situation prediction system demonstrate that the proposed model can be solved effectively and that the resulting policies convincingly outperform others.
  • Keywords
    Markov processes; computer bootstrapping; optimisation; software fault tolerance; automatic failure recovery; failure detection; fast informed bound solution; micro-rebootable distributed systems; partially observable Markov decision processes; recovery policy optimization; value function approximate solution; Aging; Availability; Computational complexity; Computational modeling; Computer networks; Information science; Iterative methods; Neural networks; Predictive models; Software systems;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information Science and Engineering (ICISE), 2009 1st International Conference on
  • Conference_Location
    Nanjing
  • Print_ISBN
    978-1-4244-4909-5
  • Type

    conf

  • DOI
    10.1109/ICISE.2009.292
  • Filename
    5455702