DocumentCode
2237013
Title
An Optimized Policy for Automatic Failure Recovery in Microrebootable Distributed Systems
Author
Lu Xu ; Wang Hui-qiang ; Zhao Guo-sheng
Author_Institution
Coll. of Comput. Sci. & Technol., Harbin Eng. Univ., Harbin, China
fYear
2009
fDate
26-28 Dec. 2009
Firstpage
1662
Lastpage
1665
Abstract
To address the challenge of generating recovery policies in the presence of inaccurate failure detection, this paper presents a failure recovery model for microrebootable distributed systems based on discounted Partially Observable Markov Decision Processes (POMDPs). Recovery policies are generated by solving the POMDP model. To reduce the computational complexity of an exact solution, a value-function approximation method called the fast informed bound is used to obtain near-optimal policies. Simulation-based experimental results on a realistic network security situation prediction system demonstrate that the proposed model can be solved effectively and that the resulting policies clearly outperform others.
Keywords
Markov processes; computer bootstrapping; optimisation; software fault tolerance; automatic failure recovery; failure detection; fast informed bound solution; micro-rebootable distributed systems; partially observable Markov decision processes; recovery policy optimization; value function approximate solution; Aging; Availability; Computational complexity; Computational modeling; Computer networks; Information science; Iterative methods; Neural networks; Predictive models; Software systems;
fLanguage
English
Publisher
IEEE
Conference_Titel
Information Science and Engineering (ICISE), 2009 1st International Conference on
Conference_Location
Nanjing
Print_ISBN
978-1-4244-4909-5
Type
conf
DOI
10.1109/ICISE.2009.292
Filename
5455702