• DocumentCode
    31025
  • Title

    Tree-structured parallel regeneration for multiple data losses in distributed storage systems based on erasure codes

  • Author

    Sun Weidong ; Yijie, Wang ; Pei Xiaoqiang

  • Author_Institution
    Nat. Key Lab. for Parallel & Distrib. Process., Nat. Univ. of Defense Technol., Changsha, China
  • Volume
    10
  • Issue
    4
  • fYear
    2013
  • fDate
    Apr-13
  • Firstpage
    113
  • Lastpage
    125
  • Abstract
    To reduce the time required to complete the regeneration process of erasure codes, we propose a Tree-structured Parallel Regeneration (TPR) scheme for multiple data losses in distributed storage systems. Under the scheme, two algorithms are proposed for the construction of multiple regeneration trees, namely the edge-disjoint algorithm and edge-sharing algorithm. The edge-disjoint algorithm constructs multiple independent trees, and is simple and appropriate for environments where newcomers and their providers are distributed over a large area and have few intersections. The edge-sharing algorithm constructs multiple trees that compete to utilize the bandwidth, and make a better utilization of the bandwidth, although it needs to measure the available bandwidth and deal with the bandwidth changes; it is therefore difficult to implement in practical systems. The parallel regeneration for multiple data losses of TPR primarily includes two optimizations: firstly, transferring the data through the bandwidth optimized-paths in a pipeline manner; secondly, executing data regeneration over multiple trees in parallel. To evaluate the proposal, we implement an event-based simulator and make a detailed comparison with some popular regeneration methods. The quantitative comparison results show that the use of TPR employing either the edge-disjoint algorithm or edge-sharing algorithm reduces the regeneration time significantly.
  • Keywords
    parallel programming; storage management; tree data structures; TPR scheme; distributed storage systems; edge-sharing algorithm; erasure codes; multiple data loss; tree-structured parallel regeneration; Algorithm design and analysis; Bandwidth; Data models; Distributed databases; Encoding; Redundancy; distributed storage system; erasure code; regeneration tree; replication;
  • fLanguage
    English
  • Journal_Title
    Communications, China
  • Publisher
    ieee
  • ISSN
    1673-5447
  • Type

    jour

  • DOI
    10.1109/CC.2013.6506936
  • Filename
    6506936