DocumentCode :
2550166
Title :
Intelligent crawling based on rough set for web resource discovery
Author :
Hu, LingXia
Author_Institution :
Inf. Eng., Zhong Shan Torch Polytech. Coll., Zhong Shan, China
fYear :
2010
fDate :
16-18 April 2010
Firstpage :
624
Lastpage :
627
Abstract :
The rapid development of the Internet raises a new problem: how to quickly and effectively retrieve needed web resources from a vast number of web pages. Progress in machine learning techniques points to a new direction for solving this problem. In this paper, an intelligent crawling algorithm based on rough sets is proposed. The algorithm uses the behavior of hypertext features to perform topic-specific resource discovery. Our experiments show a better harvest rate and better target recall for focused crawling.
Keywords :
Internet; hypermedia; information retrieval; learning (artificial intelligence); rough set theory; Web resource discovery; hypertext features behavior; intelligent crawling algorithm; machine learning; rough set; Crawlers; Educational institutions; Information retrieval; Internet; Machine learning algorithms; Search engines; Software libraries; Taxonomy; Uniform resource locators; Web pages; Classification; Intelligent Crawling; Web Resource Discovery;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Information Management and Engineering (ICIME), 2010 The 2nd IEEE International Conference on
Conference_Location :
Chengdu
Print_ISBN :
978-1-4244-5263-7
Electronic_ISBN :
978-1-4244-5265-1
Type :
conf
DOI :
10.1109/ICIME.2010.5477905
Filename :
5477905