• DocumentCode
    1901191
  • Title

    An Architectural Framework of a Crawler for Locating Deep Web Repositories Using Learning Multi-agent Systems

  • Author

    Akilandeswari, J. ; Gopalan, N.P.

  • Author_Institution
    Dept. of CSE, Sona Coll. of Technol., Salem
  • fYear
    2008
  • fDate
    8-13 June 2008
  • Firstpage
    558
  • Lastpage
    562
  • Abstract
    The World Wide Web (WWW) has become one of the largest and most readily accessible repositories of human knowledge. Traditional search engines index only the surface Web, whose pages are easily found. The focus has now moved to the invisible Web or hidden Web, which consists of a large warehouse of useful data such as images, sounds, presentations and many other types of media. To utilize such data, there is a need for a specialized program to locate those sites, as we do with search engines. This paper discusses an effective design of a hidden Web crawler that can autonomously discover pages from the hidden Web by employing a multi-agent Web mining system. A theoretical framework is suggested to investigate the resource discovery problem, and the empirical results suggest substantial improvement in the crawling strategy and harvest rate.
  • Keywords
    Internet; data mining; information retrieval; learning (artificial intelligence); multi-agent systems; search engines; Web crawler architectural framework; World Wide Web; data warehouse; deep Web repository location; hidden Web; invisible Web; learning multiagent systems; multiagent Web mining system; resource discovery; search engine index; site location; Crawlers; Databases; Humans; Multiagent systems; Web mining; Web pages; Web sites; hidden Web crawler; multi-agents; reinforcement learning
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Internet and Web Applications and Services, 2008. ICIW '08. Third International Conference on
  • Conference_Location
    Athens
  • Print_ISBN
    978-0-7695-3163-2
  • Electronic_ISBN
    978-0-7695-3163-2
  • Type

    conf

  • DOI
    10.1109/ICIW.2008.94
  • Filename
    4545672