• DocumentCode
    3597041
  • Title

    Adaptive focused crawler based on tunneling and link analysis

  • Author

    Zhang, Xiaoming ; Li, Zhoujun ; Hu, Chaojian

  • Author_Institution
    Sch. of Comput. Sci. & Eng., Beihang Univ., Beijing
  • Volume
    3
  • fYear
    2009
  • Firstpage
    2225
  • Lastpage
    2230
  • Abstract
    At present, focused crawlers have become a common way to find needed information. The main characteristic of a focused web crawler is that it selects and retrieves only relevant web pages during each crawl. In this paper, we propose a learnable algorithm that combines link analysis with web content to retrieve specific web documents and that learns to predict the next URL to visit. The algorithm also uses adaptive tunneling to overcome some of the limitations of conventional focused crawlers. We apply three metrics to compare its efficiency with that of other well-known Web crawling techniques.
  • Keywords
    Internet; information retrieval; information retrieval systems; Web content; Web document retrieval; adaptive focused Web crawler; learnable algorithm; link analysis; tunneling analysis; Algorithm design and analysis; Chaos; Computer science; Content based retrieval; Crawlers; Information analysis; Testing; Tunneling; Uniform resource locators; Web pages;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Advanced Communication Technology, 2009. ICACT 2009. 11th International Conference on
  • ISSN
    1738-9445
  • Print_ISBN
    978-89-5519-138-7
  • Electronic_ISBN
    1738-9445
  • Type

    conf

  • Filename
    4809522