Title :
Web robot detection techniques based on statistics of their requested URL resources
Author :
Guo, Weigang ; Ju, Shiguang ; Gu, Yi
Author_Institution :
Inf. Center, Foshan Univ., Guangdong, China
Abstract :
With the widespread use of search engines, the impact of Web robots on Web sites should not be ignored. Based on an analysis of the navigational patterns of Web robots in Web logs, two new algorithms are proposed. The first is based on classification and statistics of requested URL resources: it classifies the URL resources into eight types and counts the number of client sessions and the number of visit records of the same type. The second is based on a Web page member list: it constructs one member list for every Web page and one show table for every visitor. Experiments show that the two new algorithms can detect unknown robots and unfriendly robots that do not obey the Robots Exclusion Standard.
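The first algorithm's counting step can be illustrated with a minimal sketch. The abstract does not list the eight resource types, so the categories, the extension mapping, and the names `resource_type` and `count_types_per_client` below are all hypothetical placeholders, not the authors' definitions.

```python
from collections import Counter, defaultdict
from urllib.parse import urlparse
import posixpath

# Hypothetical mapping from file extension to resource type.
# The paper's actual eight types are not given in the abstract.
TYPE_BY_EXTENSION = {
    ".html": "page", ".htm": "page",
    ".jpg": "image", ".png": "image", ".gif": "image",
    ".css": "style", ".js": "script",
}

def resource_type(url):
    """Return an assumed resource type for a requested URL."""
    path = urlparse(url).path
    if path == "/robots.txt":
        return "robots"
    extension = posixpath.splitext(path)[1].lower()
    return TYPE_BY_EXTENSION.get(extension, "other")

def count_types_per_client(records):
    """Count requests per resource type for each client.

    records: iterable of (client_id, url) pairs taken from a Web log.
    """
    counts = defaultdict(Counter)
    for client_id, url in records:
        counts[client_id][resource_type(url)] += 1
    return counts
```

Under these assumptions, a client whose counts are dominated by one type (for example only pages, with no images or style sheets) might be flagged for closer inspection; the decision rule itself is not specified in the abstract and is left out here.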
Keywords :
Internet; Web sites; search engines; statistical analysis; URL resource; Web logs; Web page; Web robot detection; search engine; Algorithm design and analysis; Humans; Pattern analysis; Robotics and automation; Robots; Statistics; Uniform resource locators; Web pages
Conference_Title :
Computer Supported Cooperative Work in Design, 2005. Proceedings of the Ninth International Conference on
Print_ISBN :
1-84600-002-5
DOI :
10.1109/CSCWD.2005.194187