Title :
A predication-based approach for effective resource discovery in topical web
Author :
Ma, Liang ; Chen, Qunxiu ; Wang, Jun ; Xu, Guowei ; Cai, Lianhong
Author_Institution :
Dept. of Comput. Sci., Tsinghua Univ., Beijing, China
Abstract :
Due to the enormous growth of the World Wide Web in recent years, crawling specific topical portions quickly, without having to explore all Web pages, has become a new challenge for resource discovery. A recent idea is to predict each URL's degree of relevance to the topic from related properties of the URL, then crawl the URLs with the highest predicted relevance first. In this paper, we study topical resources further and introduce several new properties that make relevance prediction more effective. We also improve the evaluation algorithm and add two rules that adjust the weights of the factors dynamically, which leads to better prediction precision. These changes improve system performance through a higher topic harvest rate and lower sensitivity to the choice of initial seed URLs.
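The general idea of scoring URLs before fetching them can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract names neither the URL properties nor the two weight-adjustment rules, so the feature names (`anchor_text`, `url_tokens`, `parent_relevance`), the weights, and the `Frontier` class are all hypothetical, and the dynamic weight adjustment is not modeled.

```python
import heapq

# Hypothetical feature weights; the abstract does not list the real
# properties or their weights, so these values are assumptions.
WEIGHTS = {"anchor_text": 0.5, "url_tokens": 0.2, "parent_relevance": 0.3}


def predict_relevance(features, weights=WEIGHTS):
    """Weighted sum of feature scores, each assumed to lie in [0, 1]."""
    return sum(w * features.get(name, 0.0) for name, w in weights.items())


class Frontier:
    """Crawl frontier that returns the highest-scoring unseen URL first."""

    def __init__(self):
        self._heap = []
        self._seen = set()

    def push(self, url, features):
        # Skip URLs that were already queued once.
        if url in self._seen:
            return
        self._seen.add(url)
        # heapq is a min-heap, so store the negated score.
        heapq.heappush(self._heap, (-predict_relevance(features), url))

    def pop(self):
        neg_score, url = heapq.heappop(self._heap)
        return url, -neg_score

    def __len__(self):
        return len(self._heap)
```

A crawler built on this sketch would pop a URL, fetch the page, extract its out-links with their features, and push them back, so that pages predicted to be on-topic are visited before the rest.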
Keywords :
Web sites; information retrieval; URL relevance degree predication; World Wide Web; effective resource discovery; topic harvest rate; topical Web; Aggregates; Computer science; Couplings; Crawlers; Data mining; Intelligent systems; Search engines; Uniform resource locators; Web pages
Conference_Title :
TENCON '02. Proceedings. 2002 IEEE Region 10 Conference on Computers, Communications, Control and Power Engineering
Print_ISBN :
0-7803-7490-8
DOI :
10.1109/TENCON.2002.1181214