• DocumentCode
    3143821
  • Title

    Availability-Based Methods for Distributed Storage Systems

  • Author

    Kermarrec, A. ; Merrer, E.L. ; Straub, Gilles ; Van Kempen, A.

  • fYear
    2012
  • fDate
    8-11 Oct. 2012
  • Firstpage
    151
  • Lastpage
    160
  • Abstract
    Distributed storage systems rely heavily on redundancy to ensure data availability as well as durability. In networked systems subject to intermittent node unavailability, the level of redundancy introduced in the system should be minimized and maintained upon failures. Repairs are well known to be extremely bandwidth-consuming and it has been shown that, without care, they may significantly congest the system. In this paper, we propose an approach to redundancy management accounting for node heterogeneity with respect to availability. We show that by using the availability history of nodes, the performance of two important facets of distributed storage (replica placement and repair) can be significantly improved. Replica placement is achieved based on complementary nodes with respect to node availability, improving the overall data availability. Repairs can be scheduled thanks to an adaptive per-node timeout according to node availability, so as to decrease the number of repairs while reaching comparable availability. We propose practical heuristics for those two issues. We evaluate our approach through extensive simulations based on real and well-known availability traces. Results clearly show the benefits of our approach with regard to the critical trade-off between data availability, load balancing and bandwidth consumption.
  • Keywords
    distributed memory systems; minimisation; performance evaluation; redundancy; reliability theory; replicated databases; resource allocation; storage management; adaptive per-node timeout; availability traces; availability-based methods; bandwidth consumption; data availability; distributed storage performance improvement; distributed storage systems; durability; intermittent node unavailability; load balancing; nodes availability; nodes heterogeneity; redundancy management; redundancy minimization; replica placement; Availability; Bandwidth; History; Load management; Maintenance engineering; Peer to peer computing; Redundancy;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Reliable Distributed Systems (SRDS), 2012 IEEE 31st Symposium on
  • Conference_Location
    Irvine, CA
  • ISSN
    1060-9857
  • Print_ISBN
    978-1-4673-2397-0
  • Type

    conf

  • DOI
    10.1109/SRDS.2012.10
  • Filename
    6424849