Author_Institution :
Dept. of Comput. Sci., Univ. of Illinois at Urbana-Champaign, Urbana, IL
Abstract :
This paper addresses the problem of selecting and discovering a consistent availability monitoring overlay for computer hosts in a large-scale distributed application, where hosts may be selfish or colluding. We motivate six significant goals for the problem: consistency, verifiability, and randomness in selecting the availability monitors of nodes, and discoverability, load balancing, and scalability in finding these monitors. We then present a new system, called AVMON, that is the first to satisfy these six requirements. The core algorithmic contribution of this paper is a protocol for discovering the availability monitoring overlay in a scalable and efficient manner, given any monitor selection scheme that is consistent and verifiable. We mathematically analyze the performance of AVMON's discovery protocols, and derive an optimal variant that minimizes memory, bandwidth, computation, and discovery time of monitors. Our experimental evaluations of AVMON use three types of availability traces (synthetic, PlanetLab, and the Overnet peer-to-peer system) and demonstrate that AVMON works well in a variety of distributed systems.
Keywords :
peer-to-peer computing; reliability; resource allocation; AVMON; computer hosts; consistent availability monitoring overlays; discoverability; distributed systems; load balancing; optimal discovery; peer-to-peer system; scalability; scalable discovery; Application software; Availability; Computer displays; Computerized monitoring; Distributed computing; Large-scale systems; Peer to peer computing; Performance analysis; Protocols; Scalability; Churn; Consistency; Monitoring; Optimality; Overlay