DocumentCode :
1588355
Title :
HAWK: A Focused Crawler with Content and Link Analysis
Author :
Chen, Xiaoyun ; Zhang, Xin
Author_Institution :
Sch. of Inf. Sci. & Eng., Lanzhou Univ., Lanzhou
fYear :
2008
Firstpage :
677
Lastpage :
680
Abstract :
Maintaining the currency of search engine indices by exhaustive crawling is rapidly becoming impossible due to the increasing size of the web. Focused crawlers aim to search only the subset of the web related to a specific topic, and offer a potential solution to this problem. However, focused crawling has problems of its own, chief among them how to retrieve the maximal set of relevant, high-quality pages. To address this, we design a focused crawler, called HAWK, that uses not only the content of web pages to improve page relevance but also the link structure to improve coverage of a specific topic.
Keywords :
Web sites; search engines; HAWK; content analysis; focused crawler; link analysis; search engine; web page; Crawlers; Information analysis; Information science; Maintenance engineering; Marine animals; Partial response channels; Uniform resource locators; Web pages; Web server; content; link structure
fLanguage :
English
Publisher :
IEEE
Conference_Title :
e-Business Engineering, 2008. ICEBE '08. IEEE International Conference on
Conference_Location :
Xi'an
Print_ISBN :
978-0-7695-3395-7
Type :
conf
DOI :
10.1109/ICEBE.2008.46
Filename :
4690687