• DocumentCode
    1930526
  • Title
    AKSHR: A novel framework for a Domain-specific Hidden Web Crawler
  • Author
    Bhatia, Komal Kumar ; Sharma, A.K. ; Madaan, Rosy
  • Author_Institution
    Dept. of Comput. Eng., YMCA Inst. of Eng., Faridabad, India
  • fYear
    2010
  • fDate
    28-30 Oct. 2010
  • Firstpage
    307
  • Lastpage
    312
  • Abstract
    Existing search engines crawl and index the surface web while ignoring the hidden web, which contains more than 500 times as much information as the publicly indexable web (PIW). This paper proposes a Domain-specific Hidden Web Crawler (AKSHR). The framework extracts hidden web pages by drawing on three distinguishing features: 1) automatic downloading of search interfaces to crawl hidden web databases, 2) identification of semantic mappings between search interface elements using a novel approach called DSIM (Domain-specific Interface Mapper), and 3) automatic filling of search interfaces. The effectiveness of the proposed framework was evaluated through experiments on real web sites, and the preliminary results were encouraging.
  • Keywords
    Web sites; information retrieval; search engines; AKSHR; automatic downloading; crawl hidden Web database; domain-specific hidden Web crawler; domain-specific interface mapper; hidden Web pages; index surface Web; search engines crawl; search interfaces; semantic mapping; Crawlers; Data mining; Databases; Filling; Semantics; Web pages; Crawling; Hidden Web; search engine
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel Distributed and Grid Computing (PDGC), 2010 1st International Conference on
  • Conference_Location
    Solan
  • Print_ISBN
    978-1-4244-7675-6
  • Type
    conf
  • DOI
    10.1109/PDGC.2010.5679916
  • Filename
    5679916