DocumentCode
1930526
Title
AKSHR: A novel framework for a Domain-specific Hidden Web Crawler
Author
Bhatia, Komal Kumar ; Sharma, A.K. ; Madaan, Rosy
Author_Institution
Dept. of Comput. Eng., YMCA Inst. of Eng., Faridabad, India
fYear
2010
fDate
28-30 Oct. 2010
Firstpage
307
Lastpage
312
Abstract
Existing search engines crawl and index the surface web, ignoring the hidden web, which contains more than 500 times as much information as the publicly indexable web (PIW). In this paper, a Domain-specific Hidden Web Crawler (AKSHR) is proposed. The framework extracts hidden web pages by drawing on three distinctive features: 1) automatic downloading of search interfaces to crawl hidden web databases, 2) identification of semantic mappings between search interface elements using a novel approach called DSIM (Domain-specific Interface Mapper), and 3) automatic filling of search interfaces. The effectiveness of the proposed framework has been evaluated through experiments on real web sites, and encouraging preliminary results were obtained.
Keywords
Web sites; information retrieval; search engines; AKSHR; automatic downloading; crawl hidden Web database; domain-specific hidden Web crawler; domain-specific interface mapper; hidden Web pages; index surface Web; search engines crawl; search interfaces; semantic mapping; Crawlers; Data mining; Databases; Filling; Semantics; Web pages; Crawling; Hidden Web; search engine
fLanguage
English
Publisher
IEEE
Conference_Titel
Parallel Distributed and Grid Computing (PDGC), 2010 1st International Conference on
Conference_Location
Solan
Print_ISBN
978-1-4244-7675-6
Type
conf
DOI
10.1109/PDGC.2010.5679916
Filename
5679916