Title :
An Intelligent Topic Web Crawler Based on DTB
Author :
Zhao, Ming-sheng ; Zhu, Peng ; He, Tian-chi
Author_Institution :
Dept. of Inf. Technol., Nanjing Forest Police Coll., Nanjing, China
Abstract :
Web crawling is a fundamental step in many Web applications, such as search engines and data mining. Based on a study of Web crawlers that filter URLs using different methods, this paper proposes an intelligent topic Web crawler built on a DTB (dynamic topic base). The crawler updates the topic base automatically and improves the accuracy of URL filtering. Experimental results show that the proposed crawler fetches more topic-relevant Web pages while crawling less of the Web space and taking less time.
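The abstract does not say how the dynamic topic base is stored or updated, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes the topic base is a set of terms, that a link is scored by the fraction of its anchor-text tokens found in that set, and that a new term joins the base once it has appeared in enough relevant pages. The names DynamicTopicBase, threshold and promote_after are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class DynamicTopicBase:
    """Hypothetical topic base: a term set that grows as relevant pages are seen."""

    def __init__(self, seed_terms, threshold=0.5, promote_after=2):
        self.terms = {t.lower() for t in seed_terms}
        self.threshold = threshold          # assumed relevance cut-off
        self.promote_after = promote_after  # assumed number of sightings before promotion
        self.candidates = Counter()

    def score(self, text):
        """Fraction of tokens in the text that belong to the topic base."""
        tokens = tokenize(text)
        if not tokens:
            return 0.0
        hits = sum(1 for t in tokens if t in self.terms)
        return hits / len(tokens)

    def is_relevant(self, text):
        """Filter decision for a link, based on its anchor text."""
        return self.score(text) >= self.threshold

    def update(self, page_text):
        """Call on pages judged relevant; promotes terms that keep recurring."""
        for term in set(tokenize(page_text)):  # count each term once per page
            if term in self.terms:
                continue
            self.candidates[term] += 1
            if self.candidates[term] >= self.promote_after:
                self.terms.add(term)
                del self.candidates[term]


base = DynamicTopicBase(["crawler", "search"])
print(base.is_relevant("web crawler search"))  # 2 of 3 tokens match
base.update("web crawler index")
base.update("web search index")
print(sorted(base.terms))
```

A practical version would at least need stop-word handling and term weighting, since this sketch promotes any term that recurs often enough.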
Keywords :
Internet; information filtering; search engines; URL filtering; Web page; data mining; dynamic topic base; intelligent topic Web crawler; DTB; topic relevancy; topic Web crawler
Conference_Title :
Web Information Systems and Mining (WISM), 2010 International Conference on
Conference_Location :
Sanya
Print_ISBN :
978-1-4244-8438-6
DOI :
10.1109/WISM.2010.155