DocumentCode :
3351512
Title :
A New Algorithm of Topical Crawler
Author :
Wei-jiang, Li ; Hua-suo, Ru ; Tie-jun, Zhao ; Wen-mao, Zang
Author_Institution :
Comput. Applic. Key Lab. of Yunnan Province, Kunming Univ. of Sci. & Technol., Kunming, China
Volume :
1
fYear :
2009
fDate :
28-30 Oct. 2009
Firstpage :
443
Lastpage :
446
Abstract :
Generic crawlers help people find information on the WWW. However, because they are general-purpose rather than specialized, they have drawbacks in precision and efficiency. In this paper, we address two issues of the topical web crawler: how to define the topic, and how to efficiently order the links waiting in the download queue. The crawler aims to visit only relevant pages and to collect a large number of hyperlinks that point to relevant pages. The crawl method proposed in this paper is novel and is based on the semi-structured features of websites and on content information. Experimental results show that it is a very effective method for focused crawling.
Keywords :
social networking (online); WWW finding information; Website semi structured features; hyperlinks great scale; queue efficiently downloaded; topical crawler algorithm; topical web crawler; Cities and towns; Computer applications; Computer science; Crawlers; Databases; Information resources; Laboratories; Search engines; Web pages; World Wide Web; Algorithm; Generic Crawler; Topical Crawler;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computer Science and Engineering, 2009. WCSE '09. Second International Workshop on
Conference_Location :
Qingdao
Print_ISBN :
978-0-7695-3881-5
Type :
conf
DOI :
10.1109/WCSE.2009.706
Filename :
5403244