DocumentCode :
2721782
Title :
An Ontology-Based Topical Crawling Algorithm for Accessing Deep Web Content
Author :
Arya, K.V. ; Vadlamudi, B.R.
Author_Institution :
ABV-IIITM, Gwalior, India
fYear :
2012
fDate :
23-25 Nov. 2012
Firstpage :
1
Lastpage :
6
Abstract :
Given the volume of the Web and the frequency with which its content changes, the coverage and quality of the pages retrieved by modern search engines are relatively limited, since they crawl only hypertext links and ignore the search forms that are the entry points to deep web content, where two-thirds of the information resides. In this paper, an algorithm is designed to enable topical crawlers to access hidden web content by using a domain-based ontology to determine the forms' relevance to the domain. The scientific research publications domain is considered in this work. Experimental results show that the proposed approach performs better than keyword-based crawlers in terms of both relevancy and completeness.
Keywords :
Internet; information retrieval; ontologies (artificial intelligence); search engines; Web information; deep Web content access; domain based ontology; hypertext links; keyword based crawlers; ontology-based topical crawling algorithm; scientific research publications domain; search engines; topical crawlers; Arrays; Crawlers; Databases; HTML; Ontologies; Search engines; Web pages; Deep web; Domain ontology; Focused crawler; Form processing;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computer and Communication Technology (ICCCT), 2012 Third International Conference on
Conference_Location :
Allahabad
Print_ISBN :
978-1-4673-3149-4
Type :
conf
DOI :
10.1109/ICCCT.2012.10
Filename :
6394657