DocumentCode
88027
Title
Boosting Degraded Reads in Heterogeneous Erasure-Coded Storage Systems
Author
Yunfeng Zhu ; Jian Lin ; Lee, Patrick P. C. ; Yinlong Xu
Author_Institution
AnHui Province Key Lab. of High Performance Comput., Univ. of Sci. & Technol. of China, Hefei, China
Volume
64
Issue
8
fYear
2015
fDate
Aug. 1 2015
Firstpage
2145
Lastpage
2157
Abstract
Distributed storage systems provide large-scale data storage services, yet they are confronted with frequent node failures. To ensure data availability, a storage system often introduces data redundancy via replication or erasure coding. As erasure coding incurs significantly less redundancy overhead than replication under the same fault tolerance, it has been increasingly adopted in large-scale storage systems. In erasure-coded storage systems, degraded reads to temporarily unavailable data are very common, and hence boosting the performance of degraded reads becomes important. One challenge is that storage nodes tend to be heterogeneous, with different storage capacities and I/O bandwidths. To this end, we propose FastDR, a system that addresses node heterogeneity and exploits I/O parallelism so as to boost the performance of degraded reads to temporarily unavailable data. FastDR incorporates a greedy algorithm that seeks to reduce the data transfer cost of reading surviving data for degraded reads, while allowing the search for an efficient degraded read solution to be completed in a timely manner. We implement a FastDR prototype and conduct extensive evaluation through simulation studies as well as testbed experiments on a Hadoop cluster with 10 storage nodes. We demonstrate that FastDR achieves more efficient degraded reads than existing approaches.
Keywords
digital storage; electronic data interchange; encoding; FastDR; Hadoop cluster; data redundancy; data replication; data transfer; degraded read boosting; distributed storage systems; erasure coding; frequent node failures; heterogeneous erasure-coded storage systems; large-scale data storage services; large-scale storage systems; Bandwidth; Decoding; Encoding; Equations; Optimization; Parallel processing; Redundancy; Erasure-coded storage system; I/O parallelism; degraded reads; node heterogeneity
fLanguage
English
Journal_Title
Computers, IEEE Transactions on
Publisher
IEEE
ISSN
0018-9340
Type
jour
DOI
10.1109/TC.2014.2360543
Filename
6911949