DocumentCode
3597041
Title
Adaptive focused crawler based on tunneling and link analysis
Author
Zhang, Xiaoming ; Li, Zhoujun ; Hu, Chaojian
Author_Institution
Sch. of Comput. Sci. & Eng., Beihang Univ., Beijing
Volume
3
fYear
2009
Firstpage
2225
Lastpage
2230
Abstract
Focused crawlers have become a common way to find needed information. The main characteristic of a focused web crawler is that it selects and retrieves only relevant web pages during each crawl. In this paper, we propose a learnable algorithm that combines link analysis with web content to retrieve specific web documents and that learns to predict the next URL to visit. The algorithm also uses adaptive tunneling to overcome some of the limitations of conventional focused crawlers. We apply three metrics to compare its efficiency with that of other well-known Web crawling techniques.
Keywords
Internet; information retrieval; information retrieval systems; Web content; Web document retrieval; adaptive focused Web crawler; learnable algorithm; link analysis; tunneling analysis; Algorithm design and analysis; Chaos; Computer science; Content based retrieval; Crawlers; Information analysis; Testing; Tunneling; Uniform resource locators; Web pages;
fLanguage
English
Publisher
IEEE
Conference_Titel
Advanced Communication Technology, 2009. ICACT 2009. 11th International Conference on
ISSN
1738-9445
Print_ISBN
978-89-5519-138-7
Electronic_ISBN
1738-9445
Type
conf
Filename
4809522