DocumentCode :
2177826
Title :
Reliability of Disk Arrays with Double Parity
Author :
Schwarz, S. J. Thomas ; Long, Darrell D. E. ; Paris, Jehan-Francois
Author_Institution :
Univ. Catolica del Uruguay, Montevideo, Uruguay
fYear :
2013
fDate :
2-4 Dec. 2013
Firstpage :
108
Lastpage :
117
Abstract :
We present a general method for estimating the risk of data loss in arbitrary two-dimensional RAID arrays where each data disk belongs to exactly two single-parity stripes. We start by representing each array organization by a graph where each parity stripe, and its associated parity disk, is represented by a node and each data disk by an edge. We then use this representation to identify and enumerate minimal sets of disk failures, say, triple failures, quadruple failures and so forth, that will cause a data loss. The overall probability that a given number n of disk failures will cause a data loss is then given by the ratio of the number of fatal disk failures involving n disks to the total number of possible failures of n disks. To illustrate the power of our method, we apply it to two distinct, archival two-dimensional array organizations. The first, "square" organization is a traditional square layout where data disks are arranged in a square and the parity stripes are formed by the rows and columns of the square. Hence a square layout organization with n^2 data disks will have 2n parity disks. The second, "complete" organization corresponds to a closer weave, where all parity stripes intersect and each intersection contains a data disk. This organization with n parity disks will have n(n - 1)/2 data disks. Our results show that previous ad hoc estimates significantly underestimated the reliability of these arrays by assuming that either all triple or all quadruple disk failures were fatal. We show that the two two-dimensional array organizations exhibit mean times to data loss and five-year survival rates that are very similar to those of a RAID Level 6 organization of much smaller capacity. Our complete organization is about 4.5 times and the square organization is about 8 times more reliable than a disk array with the same storage capacity built from RAID Level 6 stripes.
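The counting idea in the abstract can be illustrated with a small brute-force sketch. It is not the authors' procedure: the abstract describes enumerating minimal fatal failure sets, whereas this sketch simply tries every set of k failed disks on a tiny array. Several things here are assumptions made for illustration: the function names, the disk labels ("P" for parity, "D" for data), the rule that a single-parity stripe repairs exactly one failed member at a time, and the modelling of the square layout as one node per row stripe and one per column stripe.

```python
from itertools import combinations
from math import comb

def complete_layout(n):
    """Assumed model: n parity nodes, one data disk (edge) per pair of nodes."""
    return list(range(n)), list(combinations(range(n), 2))

def square_layout(n):
    """Assumed model: n row stripes and n column stripes as nodes,
    with one data disk (edge) at every row/column intersection."""
    rows = [("r", i) for i in range(n)]
    cols = [("c", j) for j in range(n)]
    return rows + cols, [(r, c) for r in rows for c in cols]

def loses_data(nodes, failed):
    """Repair any stripe with exactly one failed member until nothing changes.

    Disks are labelled ("P", node) for parity and ("D", (u, v)) for data.
    Any disk still failed at the end implies an unrecoverable data disk,
    because a lone failed parity disk is always repairable from its stripe.
    """
    failed = set(failed)
    progress = True
    while failed and progress:
        progress = False
        for v in nodes:
            # Failed members of stripe v: its parity disk and incident data disks.
            hit = [d for d in failed
                   if d == ("P", v) or (d[0] == "D" and v in d[1])]
            if len(hit) == 1:
                failed.discard(hit[0])
                progress = True
    return bool(failed)

def fatal_fraction(nodes, edges, k):
    """Return (fatal sets of k failed disks, all sets of k failed disks)."""
    disks = [("P", v) for v in nodes] + [("D", e) for e in edges]
    fatal = sum(loses_data(nodes, s) for s in combinations(disks, k))
    return fatal, comb(len(disks), k)

if __name__ == "__main__":
    for k in (2, 3, 4):
        print("complete, 4 parity disks, k =", k,
              fatal_fraction(*complete_layout(4), k))
        print("square, 2 x 2 data disks, k =", k,
              fatal_fraction(*square_layout(2), k))
```

Under these assumptions, the complete layout with 4 parity disks has 10 fatal triple failures out of 120 possible ones: 6 consist of a data disk together with both of its parity disks, and 4 consist of three data disks forming a triangle. Exhaustive enumeration grows combinatorially with array size, so this is only practical for very small examples.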
Keywords :
RAID; graph theory; probability; RAID Level 6 organization; complete organization; data disk; data loss; data loss risk estimation; disk arrays reliability; disk failure; double parity; parity disk; probabilities; redundant arrays of inexpensive disks; single-parity stripes; square layout organization; two-dimensional RAID arrays; two-dimensional array organizations; Arrays; Data models; Layout; Organizations; Probability; Reliability theory; Disk array organization; Markov model; archival storage system; five year survival rate; mean time to data loss;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Dependable Computing (PRDC), 2013 IEEE 19th Pacific Rim International Symposium on
Conference_Location :
Vancouver, BC
Type :
conf
DOI :
10.1109/PRDC.2013.20
Filename :
6820846