DocumentCode
2923131
Title
A cost-based heterogeneous recovery scheme for distributed storage systems with RAID-6 codes
Author
Zhu, Yunfeng ; Lee, Patrick P C ; Xiang, Liping ; Xu, Yinlong ; Gao, Lingling
Author_Institution
Sch. of Comput. Sci. & Technol., Univ. of Sci. & Technol. of China, Hefei, China
fYear
2012
fDate
25-28 June 2012
Firstpage
1
Lastpage
12
Abstract
Modern distributed storage systems provide large-scale, fault-tolerant data storage. To reduce the probability of data unavailability, it is important to recover the lost data of any failed storage node efficiently. In practice, storage nodes are of heterogeneous types and have different transmission bandwidths. Thus, traditional recovery solutions that simply minimize the number of data blocks being read may no longer be optimal in a heterogeneous environment. We propose a cost-based heterogeneous recovery (CHR) algorithm for RAID-6-coded storage systems. We formulate the recovery problem as an optimization model in which storage nodes are associated with generic costs. We narrow down the solution space of the model to make it practically tractable, while still achieving the global optimal solution in most cases. We implement different recovery algorithms and conduct testbed experiments on a real networked storage system with heterogeneous storage devices. We show that our CHR algorithm reduces the total recovery time of existing recovery solutions in various scenarios.
Keywords
distributed processing; fault tolerance; minimisation; storage management; CHR algorithm; RAID-6-coded storage systems; cost-based heterogeneous recovery scheme; data block minimization; data loss recovery; data unavailability probability reduction; distributed storage systems; generic costs; global optimal solution; heterogeneous storage devices; large-scale fault-tolerant data storage; networked storage system; optimization model; solution space; storage node failure; testbed experiments; total recovery time reduction; transmission bandwidths; Bandwidth; Cloud computing; Encoding; Measurement; Optimization; Peer to peer computing; Reliability; RAID-6 codes; distributed storage system; experimentation; failure recovery; node heterogeneity
fLanguage
English
Publisher
IEEE
Conference_Titel
Dependable Systems and Networks (DSN), 2012 42nd Annual IEEE/IFIP International Conference on
Conference_Location
Boston, MA
ISSN
1530-0889
Print_ISBN
978-1-4673-1624-8
Electronic_ISBN
1530-0889
Type
conf
DOI
10.1109/DSN.2012.6263934
Filename
6263934