• DocumentCode
    119500
  • Title

    MCRTREE: A Mutually Cooperative Recovery Scheme for Multiple Losses in Distributed Storage Systems Based on Tree Structure

  • Author

    Xiaoqiang Pei ; Yijie Wang ; Xingkong Ma ; Yongquan Fu ; Fangliang Xu

  • Author_Institution
    Sci. & Technol. on Parallel & Distrib. Process. Lab., Nat. Univ. of Defense Technol., Changsha, China
  • fYear
    2014
  • fDate
    6-8 Aug. 2014
  • Firstpage
    158
  • Lastpage
    167
  • Abstract
    To guarantee the reliability of distributed storage systems, erasure coding, as a redundancy scheme, has received increasing attention because it greatly improves space efficiency compared with replication schemes. However, repairing the lost data on failed nodes under erasure coding takes a long time and consumes a lot of network bandwidth. State-of-the-art studies focus on optimizing repair in the single-node-failure context, yet real-world experiments have clearly shown that multi-node failures do happen in cloud storage systems. Applying single-node repair techniques to the multi-node setting is inefficient. We propose MCRTREE, a mutually cooperative recovery scheme based on a tree structure for multiple node failures. MCRTREE improves bandwidth utilization and reduces repair time by constructing regeneration trees between the new nodes (denoted as newcomers) and the alive nodes (denoted as providers). Further, MCRTREE reduces the volume of data to be transmitted during the repair process. Numerical experiments show that MCRTREE incurs lower storage cost and maintenance bandwidth than other redundancy recovery schemes. Trace-driven simulation results reveal that MCRTREE reduces the regeneration time by 30%-50%, improves the successful regeneration probability by 10%-20%, and improves data availability by 10%-20% compared with typical repair schemes.
  • Keywords
    cloud computing; probability; storage management; tree data structures; MCRTREE; cloud storage systems; distributed storage systems; erasure coding; multiple losses; mutually cooperative recovery scheme; redundant scheme; regeneration probability; reliability; single-node-failure context; tree structure; Conferences; Distributed Storage System; Erasure Codes; Regeneration Tree; Replica;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Networking, Architecture, and Storage (NAS), 2014 9th IEEE International Conference on
  • Conference_Location
    Tianjin
  • Type

    conf

  • DOI
    10.1109/NAS.2014.33
  • Filename
    6923176