• DocumentCode
    2721782
  • Title

    An Ontology-Based Topical Crawling Algorithm for Accessing Deep Web Content

  • Author

    Arya, K.V. ; Vadlamudi, B.R.

  • Author_Institution
    ABV-IIITM, Gwalior, India
  • fYear
    2012
  • fDate
    23-25 Nov. 2012
  • Firstpage
    1
  • Lastpage
    6
  • Abstract
    Given the volume of the Web and its frequency of content change, the coverage and quality of pages retrieved by modern search engines is relatively small, since they crawl only hypertext links and ignore the search forms that are the entry points to deep web content, where two-thirds of the information resides. In this paper, an algorithm is designed to enable topical crawlers to access hidden web content by using a domain-based ontology to determine the forms' relevance to the domain. The scientific research publications domain is considered in this work. Experimental results show that the proposed approach performs better than keyword-based crawlers in terms of both relevancy and completeness.
  • Keywords
    Internet; information retrieval; ontologies (artificial intelligence); search engines; Web information; deep Web content access; domain based ontology; hypertext links; keyword based crawlers; ontology-based topical crawling algorithm; scientific research publications domain; search engines; topical crawlers; Arrays; Crawlers; Databases; HTML; Ontologies; Search engines; Web pages; Deep web; Domain ontology; Focused crawler; Form processing;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer and Communication Technology (ICCCT), 2012 Third International Conference on
  • Conference_Location
    Allahabad
  • Print_ISBN
    978-1-4673-3149-4
  • Type

    conf

  • DOI
    10.1109/ICCCT.2012.10
  • Filename
    6394657