DocumentCode :
1930526
Title :
AKSHR: A novel framework for a Domain-specific Hidden Web Crawler
Author :
Bhatia, Komal Kumar ; Sharma, A.K. ; Madaan, Rosy
Author_Institution :
Dept. of Comput. Eng., YMCA Inst. of Eng., Faridabad, India
fYear :
2010
fDate :
28-30 Oct. 2010
Firstpage :
307
Lastpage :
312
Abstract :
Existing search engines crawl and index the surface web, ignoring the hidden web, which contains more than 500 times as much information as the publicly indexable web (PIW). This paper proposes a Domain-specific Hidden Web Crawler (AKSHR). The framework extracts hidden web pages by drawing on three distinctive features: 1) automatic downloading of search interfaces to crawl hidden web databases, 2) identification of semantic mappings between search interface elements using a novel approach called DSIM (Domain-specific Interface Mapper), and 3) automatic filling of search interfaces. The effectiveness of the proposed framework was evaluated through experiments on real web sites, and encouraging preliminary results were obtained.
Keywords :
Web sites; information retrieval; search engines; AKSHR; automatic downloading; crawl hidden Web database; domain-specific hidden Web crawler; domain-specific interface mapper; hidden Web pages; index surface Web; search engines crawl; search interfaces; semantic mapping; Crawlers; Data mining; Databases; Filling; Semantics; Web pages; Crawling; Hidden Web
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Parallel Distributed and Grid Computing (PDGC), 2010 1st International Conference on
Conference_Location :
Solan
Print_ISBN :
978-1-4244-7675-6
Type :
conf
DOI :
10.1109/PDGC.2010.5679916
Filename :
5679916