DocumentCode :
2558520
Title :
Failure Prediction Models for Proactive Fault Tolerance within Storage Systems
Author :
Eckart, Ben ; Chen, Xin ; He, Xubin ; Scott, Stephen L.
Author_Institution :
Tennessee Technol. Univ., TN
fYear :
2008
fDate :
8-10 Sept. 2008
Firstpage :
1
Lastpage :
8
Abstract :
The increasingly large demand for data storage has spurred on the development of systems that rely on the aggregate performance of multiple hard drives. In many of these applications, reliability and availability are of utmost importance. It is therefore necessary to closely scrutinize a complex storage system's reliability characteristics. In this paper, we use Markov models to rigorously demonstrate the effects that failure prediction has on a system's mean time to data loss (MTTDL) given a parameterized sensitivity. We devise models for a single hard drive, RAID1, and N+1 type RAID systems. We find that the normal SMART failure prediction system has little impact on the MTTDL, but striking results can be seen when the sensitivity of the predictor reaches 0.5 or more. In past research, machine learning techniques have been proposed to improve SMART, showing that sensitivity levels of 0.5 or more are possible by training on past SMART data alone. The results of our stochastic models show that even with such relatively modest predictive power, these failure prediction algorithms can drastically extend the MTTDL of a data storage system. We feel that these results underscore the importance and need for complex prediction systems when calculating impending hard drive failures.
Keywords :
Markov processes; RAID; disc drives; failure analysis; fault tolerance; hard discs; MTTDL; Markov model; RAID system; SMART failure prediction system; complex data storage system reliability; proactive fault tolerance; stochastic model; Aggregates; Availability; Fault tolerant systems; Machine learning; Machine learning algorithms; Memory; Power system modeling; Predictive models; Reliability; Stochastic systems;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Modeling, Analysis and Simulation of Computers and Telecommunication Systems, 2008. MASCOTS 2008. IEEE International Symposium on
Conference_Location :
Baltimore, MD
ISSN :
1526-7539
Print_ISBN :
978-1-4244-2817-5
Electronic_ISBN :
1526-7539
Type :
conf
DOI :
10.1109/MASCOT.2008.4770560
Filename :
4770560