DocumentCode :
3013503
Title :
The systematic improvement of fault tolerance in the Rio file cache
Author :
Wee Teck Ng ; Chen, P.M.
Author_Institution :
Dept. of Electr. Eng. & Comput. Sci., Michigan Univ., MI, USA
fYear :
1999
fDate :
15-18 June 1999
Firstpage :
76
Lastpage :
83
Abstract :
Fault injection is typically used to characterize failures and to validate and compare fault-tolerant mechanisms. However, fault injection is rarely used for all these purposes to guide the design and implementation of a fault-tolerant system. We present a systematic and quantitative approach for using software-implemented fault injection to guide the design and implementation of a fault-tolerant system. Our system design goal is to build a write-back file cache on Intel PCs that is as reliable as a write-through file cache. We follow an iterative approach to improve robustness in the presence of operating system errors. In each iteration, we measure the reliability of the system, analyze the fault symptoms that lead to data corruption, and apply fault-tolerant mechanisms that address the fault symptoms. Our initial system is 13 times less reliable than a write-through file cache. The result of several iterations is a design that is both more reliable (1.9% vs. 3.1% corruption rate) and 5-9 times as fast as a write-through file cache.
Keywords :
cache storage; fault tolerant computing; Rio file cache; fault injection; fault tolerance; fault-tolerant system; operating system errors; robustness; write-back file cache; Computational modeling; Computer crashes; Computer science; Computer simulation; Fault detection; Fault tolerant systems; File systems; Hardware; Operating systems; Personal communication networks;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Fault-Tolerant Computing, 1999. Digest of Papers. Twenty-Ninth Annual International Symposium on
Conference_Location :
Madison, WI, USA
ISSN :
0731-3071
Print_ISBN :
0-7695-0213-X
Type :
conf
DOI :
10.1109/FTCS.1999.781036
Filename :
781036