• DocumentCode
    1236792
  • Title

    AVMON: Optimal and Scalable Discovery of Consistent Availability Monitoring Overlays for Distributed Systems

  • Author

    Morales, Ramsés ; Gupta, Indranil

  • Author_Institution
    Dept. of Comput. Sci., Univ. of Illinois at Urbana-Champaign, Urbana, IL
  • Volume
    20
  • Issue
    4
  • fYear
    2009
  • fDate
    4/1/2009 12:00:00 AM
  • Firstpage
    446
  • Lastpage
    459
  • Abstract
    This paper proposes to build overlays that help in monitoring of long-term availability histories of hosts, with a focus on large-scale distributed settings where hosts may be selfish or colluding. Concretely, we target the important problems of selection and discovery of an availability monitoring overlay. We motivate six significant goals - firstly, consistency, verifiability, and randomness, in selecting availability monitors of nodes, so as to be probabilistically resilient to selfish and colluding nodes. The next three goals are discoverability, load-balancing, and scalability in finding these monitors. We present AVMON, the first availability monitoring overlay to satisfy these six requirements. Our core algorithmic contribution is a range of protocols for discovering the availability monitoring overlay scalably and efficiently, given any arbitrary monitor selection scheme that is consistent and verifiable. We mathematically analyze the performance of AVMON's discovery protocols w.r.t. scalability and discovery time of monitors. Most interestingly, we are able to derive optimal (and practical) variants of AVMON, that minimize different combinations of memory, bandwidth, computation, and monitor discovery time. Finally, our extensive experimental evaluations using three types of availability traces - synthetic, from PlanetLab, and from the Overnet p2p system - demonstrate AVMON's practicality in a variety of distributed systems.
  • Keywords
    distributed processing; protocols; resource allocation; AVMON; Overnet p2p system; PlanetLab; consistent availability monitoring overlays; discovery protocols; distributed systems; large-scale distributed settings; load-balancing; optimal discovery; scalable discovery; Availability; Churn; Consistency; Distributed Systems; Monitoring; Optimality; Overlay; Scalability
  • fLanguage
    English
  • Journal_Title
    Parallel and Distributed Systems, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1045-9219
  • Type

    jour

  • DOI
    10.1109/TPDS.2008.84
  • Filename
    4531737