DocumentCode
3351512
Title
A New Algorithm of Topical Crawler
Author
Wei-jiang, Li ; Hua-suo, Ru ; Tie-jun, Zhao ; Wen-mao, Zang
Author_Institution
Comput. Applic. Key Lab. of Yunnan Province, Kunming Univ. of Sci. & Technol., Kunming, China
Volume
1
fYear
2009
fDate
28-30 Oct. 2009
Firstpage
443
Lastpage
446
Abstract
Generic crawlers help people find information on the WWW. However, because they are general-purpose rather than specialized, they have drawbacks in precision and efficiency. In this paper, we address two issues of the topical web crawler: how to define the topic, and how to efficiently sort the links queued for download. The crawler aims to visit only relevant pages and to collect a large number of hyperlinks that point to relevant pages. The crawl method proposed in this paper is novel and is based on the semi-structured features of the website and on content information. Experimental results show that it is a very effective method for focused crawling.
Keywords
social networking (online); WWW finding information; Website semi structured features; hyperlinks great scale; queue efficiently downloaded; topical crawler algorithm; topical web crawler; Cities and towns; Computer applications; Computer science; Crawlers; Databases; Information resources; Laboratories; Search engines; Web pages; World Wide Web; Algorithm; Generic Crawler; Topical Crawler;
fLanguage
English
Publisher
IEEE
Conference_Titel
Computer Science and Engineering, 2009. WCSE '09. Second International Workshop on
Conference_Location
Qingdao
Print_ISBN
978-0-7695-3881-5
Type
conf
DOI
10.1109/WCSE.2009.706
Filename
5403244