DocumentCode
252011
Title
A Task Scheduling Strategy Based on Weighted Round-Robin for Distributed Crawler
Author
Dajie Ge ; ZhiJun Ding
Author_Institution
Dept. of Comput. Sci. & Technol., Tongji Univ., Shanghai, China
fYear
2014
fDate
8-11 Dec. 2014
Firstpage
848
Lastpage
852
Abstract
With the rapid development of the network, stand-alone crawlers struggle to find and gather the massive amount of information available, and crawlers will gradually become distributed. This paper proposes a task scheduling strategy based on weighted Round-Robin for small-scale distributed crawlers, and formulates a weight for each node based on its crawling efficiency, so that the load is balanced across nodes. The design of the error recovery mechanism and the node table gives the crawling nodes flexible scalability and fault tolerance. Finally, we conducted experiments to demonstrate the good load balancing performance of the system.
Keywords
distributed processing; resource allocation; scheduling; task analysis; distributed crawler; load balancing; rapid development; task scheduling strategy; weighted round-robin; Algorithm design and analysis; Crawlers; Schedules; Scheduling; Scheduling algorithms; Uniform resource locators; crawlers; distributed; scheduling; weighted Round-Robin;
fLanguage
English
Publisher
IEEE
Conference_Titel
Utility and Cloud Computing (UCC), 2014 IEEE/ACM 7th International Conference on
Conference_Location
London
Type
conf
DOI
10.1109/UCC.2014.138
Filename
7027605