DocumentCode :
3291774
Title :
Design of a Distributed Spiders System Based on Web Service
Author :
Guangli, Li ; Hongbin, Zhang
Author_Institution :
East China Jiaotong Univ., China
fYear :
2009
fDate :
6-7 June 2009
Firstpage :
167
Lastpage :
170
Abstract :
A distributed spiders prototype was designed using Web services based on service-oriented architecture (SOA). The prototype is made up of a server and several clients. The server directs the clients to download new Web pages according to the pages already crawled. Each client analyzes the downloaded pages with multiple threads and manages the to-crawl URL queue, the crawled URL queue, and the noise URL queue. The clients also keep a connection with the server to pass it unknown URLs and domain names. The server is made up of the front platform and the background. The front platform controls the clients, which includes applying a load-balancing policy and monitoring the clients in real time through Microsoft Message Queue (MSMQ). The Web service is deployed on the server background, which contains the persistent data connection structure. Through this structure, the front platform and the clients can access data via a standard interface. Finally, a number of experiments show that the distributed spiders system is robust.
Keywords :
Web services; queueing theory; software architecture; Microsoft Message Queue; Web service; crawled URL queues; distributed spiders system; noise URL queue; service-oriented architecture; Application software; Internet; Monitoring; Physics computing; Queueing analysis; Service oriented architecture; Uniform resource locators; Web pages; Web server
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Web Mining and Web-based Application, 2009. WMWA '09. Second Pacific-Asia Conference on
Conference_Location :
Wuhan
Print_ISBN :
978-0-7695-3646-0
Type :
conf
DOI :
10.1109/WMWA.2009.15
Filename :
5232493