DocumentCode :
1925855
Title :
Web robot detection techniques based on statistics of their requested URL resources
Author :
Guo, Weigang ; Ju, Shiguang ; Gu, Yi
Author_Institution :
Inf. Center, Foshan Univ., Guangdong, China
Volume :
1
fYear :
2005
fDate :
24-26 May 2005
Firstpage :
302
Abstract :
With the widespread use of search engines, the impact that Web robots have on Web sites should not be ignored. Based on an analysis of the navigational patterns of Web robots in Web logs, two new algorithms are proposed. The first is based on classification and statistics of requested URL resources: it classifies the URL resources into eight types and counts the number of client sessions and the number of visit records of the same type. The second is based on a Web page member list: it constructs one member list for every Web page and one show table for every visitor. The experiment shows that the two new algorithms can detect unknown robots and unfriendly robots that do not obey the robot exclusion standard.
Keywords :
Internet; Web sites; search engines; statistical analysis; URL resource; Web logs; Web page; Web robot detection; Algorithm design and analysis; Humans; Pattern analysis; Robotics and automation; Robots; Statistics; Uniform resource locators
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Computer Supported Cooperative Work in Design, 2005. Proceedings of the Ninth International Conference on
Print_ISBN :
1-84600-002-5
Type :
conf
DOI :
10.1109/CSCWD.2005.194187
Filename :
1504093