• DocumentCode
    3008542
  • Title

    AVMON: Optimal and Scalable Discovery of Consistent Availability Monitoring Overlays for Distributed Systems

  • Author

    Morales, Ramsés ; Gupta, Indranil

  • Author_Institution
    Dept. of Comput. Sci., Univ. of Illinois at Urbana-Champaign, Urbana, IL
  • fYear
    2007
  • fDate
    25-27 June 2007
  • Firstpage
    55
  • Lastpage
    55
  • Abstract
    This paper addresses the problem of selection and discovery of a consistent availability monitoring overlay for computer hosts in a large-scale distributed application, where hosts may be selfish or colluding. We motivate six significant goals for the problem - consistency, verifiability, and randomness, in selecting the availability monitors of nodes, as well as discoverability, load-balancing, and scalability in finding these monitors. We then present a new system, called AVMON, that is the first to satisfy these six requirements. The core algorithmic contribution of this paper is a protocol for discovering the availability monitoring overlay in a scalable and efficient manner, given any arbitrary monitor selection scheme that is consistent and verifiable. We mathematically analyze the performance of AVMON's discovery protocols, and derive an optimal variant that minimizes memory, bandwidth, computation, and discovery time of monitors. Our experimental evaluations of AVMON use three types of availability traces - synthetic, from PlanetLab, and from a peer-to-peer system (Overnet) - and demonstrate that AVMON works well in a variety of distributed systems.
  • Keywords
    peer-to-peer computing; reliability; resource allocation; AVMON; computer hosts; consistent availability monitoring overlays; discoverability; distributed systems; load balancing; optimal discovery; peer-to-peer system; scalability; scalable discovery; Application software; Availability; Computer displays; Computerized monitoring; Distributed computing; Large-scale systems; Peer to peer computing; Performance analysis; Protocols; Scalability; Availability; Churn; Consistency; Monitoring; Optimality; Overlay; Scalability;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Distributed Computing Systems, 2007. ICDCS '07. 27th International Conference on
  • Conference_Location
    Toronto, ON
  • ISSN
    1063-6927
  • Print_ISBN
    0-7695-2837-3
  • Electronic_ISBN
    1063-6927
  • Type

    conf

  • DOI
    10.1109/ICDCS.2007.87
  • Filename
    4268208