• DocumentCode
    423130
  • Title

    Augmenting overlay trees for failure resiliency

  • Author

    Silber, Jeremy ; Sahu, Sambit ; Sing, J. ; Liu, Zhen

  • Author_Institution
    IBM Thomas J. Watson Res. Center, NY, USA
  • Volume
    3
  • fYear
    2004
  • fDate
    29 Nov.-3 Dec. 2004
  • Firstpage
    1525
  • Abstract
    Overlay trees typically use directed trees as efficient structures for disseminating information, but their single-path structure means that just one node failure results in the disconnection of all descendants, possibly a significant portion of the graph. The addition of extra "backup" links to a directed tree can provide alternate data paths that significantly reduce the number of nodes disconnected when some set of nodes are removed from the graph. We investigate several deterministic and randomized algorithms for adding such backup links to a directed tree and analyze the connectedness of the resulting graphs when nodes in the network fail with some random probability. We present closed-form approximations and simulation measurements for the connectivity of these augmented trees in networks ranging from hundreds to hundreds of thousands of nodes. We also identify and measure the costs of adding backup links, using simulations and real-world measurements from overlays constructed using PlanetLab latency data. We find that, with node failure rates up to 10%, deterministic backup link selection policies offer comparable resiliency to random backup links with significantly lower overhead and resource usage.
  • Keywords
    directed graphs; multicast communication; peer-to-peer computing; redundancy; telecommunication network reliability; telecommunication network topology; trees (mathematics); Internet; alternate data paths; augmenting overlay trees; connectivity; descendant node disconnection; deterministic backup link selection policies; directed tree backup links; graph connectedness; information dissemination structures; multicast overlay networks; network failure resiliency; network node failure probability; node failure rates; random backup links; redundant links; single-path structures; tree topologies; Algorithm design and analysis; Costs; Delay; Extraterrestrial measurements; Failure analysis; Internet; Protocols; Resilience; Switches; Tree graphs;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Global Telecommunications Conference, 2004. GLOBECOM '04. IEEE
  • Print_ISBN
    0-7803-8794-5
  • Type

    conf

  • DOI
    10.1109/GLOCOM.2004.1378238
  • Filename
    1378238