Title :
Multicasting a changing repository
Author :
Lam, Wang ; Garcia-Molina, Hector
Author_Institution :
Dept. of Comput. Sci., Stanford Univ., CA, USA
Abstract :
Web crawlers generate significant loads on Web servers, and are difficult to operate. Instead of repeatedly running crawlers at many "client" sites, we propose a central crawler and Web repository that multicasts appropriate subsets of the central repository, and their subsequent changes, to subscribing clients. Loads at Web servers are reduced because a single crawler visits the servers, as opposed to all the client crawlers. We model and evaluate such a central Web multicast facility for subscriber clients, and for mixes of subscriber and one-time downloader clients. We consider different performance metrics and multicast algorithms for such a multicast facility, and develop guidelines for its design under various conditions.
Keywords :
Internet; Web design; client-server systems; multicast communication; Web crawlers; Web repository; Web servers; central Web multicast facility; central crawler; central repository; client crawlers; multicast algorithms; one-time down-loader clients; subscriber clients; Algorithm design and analysis; Computer science; Crawlers; Guidelines; Measurement; Multicast algorithms; Network servers; Web pages; Web server; Web sites;
Conference_Title :
Data Engineering, 2003. Proceedings. 19th International Conference on
Print_ISBN :
0-7803-7665-X
DOI :
10.1109/ICDE.2003.1260794