• DocumentCode
    252011
  • Title

    A Task Scheduling Strategy Based on Weighted Round-Robin for Distributed Crawler

  • Author

    Dajie Ge ; ZhiJun Ding

  • Author_Institution
    Dept. of Comput. Sci. & Technol., Tongji Univ., Shanghai, China
  • fYear
    2014
  • fDate
    8-11 Dec. 2014
  • Firstpage
    848
  • Lastpage
    852
  • Abstract
    With the rapid development of the network, stand-alone crawlers struggle to find and gather the massive amount of information available, and crawlers will gradually become distributed. This paper proposes a task scheduling strategy based on weighted Round-Robin for small-scale distributed crawlers. It formulates a weight for each node based on its crawling efficiency so that load is balanced across nodes. The design of the error recovery mechanism and the node table gives crawling nodes flexible scalability and fault tolerance. Finally, we conducted experiments that demonstrate the good load balancing performance of the system.
  • Keywords
    distributed processing; resource allocation; scheduling; task analysis; distributed crawler; load balancing; rapid development; task scheduling strategy; weighted round-robin; Algorithm design and analysis; Crawlers; Schedules; Scheduling; Scheduling algorithms; Uniform resource locators; crawlers; distributed; scheduling; weighted Round-Robin;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Utility and Cloud Computing (UCC), 2014 IEEE/ACM 7th International Conference on
  • Conference_Location
    London
  • Type

    conf

  • DOI
    10.1109/UCC.2014.138
  • Filename
    7027605
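
The abstract describes assigning crawl tasks by weighted Round-Robin, with node weights derived from crawling efficiency. The record does not give the paper's weight formula, so the following is only a minimal sketch under stated assumptions: efficiency is taken to be measured pages per second, the proportional mapping in `weights_from_efficiency` is a hypothetical stand-in, and the selection step uses the widely known "smooth" weighted round-robin variant rather than whatever variant the authors implemented. All names here (`Node`, `weights_from_efficiency`, `next_node`, `assign`) are illustrative.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    weight: int       # assumed to be derived from measured crawl efficiency
    current: int = 0  # running counter used by smooth weighted round-robin


def weights_from_efficiency(pages_per_sec, scale=10):
    """Map throughput to integer weights (illustrative formula, not the paper's)."""
    top = max(pages_per_sec.values())
    return {name: max(1, round(scale * rate / top))
            for name, rate in pages_per_sec.items()}


def next_node(nodes):
    """Pick the next node using smooth weighted round-robin."""
    total = sum(n.weight for n in nodes)
    for n in nodes:
        n.current += n.weight          # every node earns its weight each turn
    chosen = max(nodes, key=lambda n: n.current)
    chosen.current -= total            # the chosen node pays back the total
    return chosen


def assign(urls, nodes):
    """Distribute URLs across nodes in proportion to their weights."""
    plan = {n.name: [] for n in nodes}
    for url in urls:
        plan[next_node(nodes).name].append(url)
    return plan


if __name__ == "__main__":
    weights = weights_from_efficiency({"a": 30.0, "b": 20.0, "c": 10.0}, scale=3)
    nodes = [Node(name, w) for name, w in weights.items()]
    plan = assign([f"http://example.com/{i}" for i in range(12)], nodes)
    for name, urls in plan.items():
        print(name, len(urls))
```

In this sketch a node that crawls faster receives proportionally more URLs per cycle, and the smooth variant interleaves assignments instead of sending a long burst to one node. Recomputing the weights periodically from fresh measurements would let the split follow changes in node performance, which is one plausible reading of the load balancing goal stated in the abstract.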