• DocumentCode
    3351512
  • Title
    A New Algorithm of Topical Crawler
  • Author
    Wei-jiang, Li ; Hua-suo, Ru ; Tie-jun, Zhao ; Wen-mao, Zang
  • Author_Institution
    Comput. Applic. Key Lab. of Yunnan Province, Kunming Univ. of Sci. & Technol., Kunming, China
  • Volume
    1
  • fYear
    2009
  • fDate
    28-30 Oct. 2009
  • Firstpage
    443
  • Lastpage
    446
  • Abstract
    Generic crawlers help people find information on the WWW. However, their generality and lack of specialization limit their precision and efficiency. In this paper, we address two issues of the topical web crawler: how to define the topic, and how to efficiently sort the links waiting in the download queue. The crawler aims to visit only relevant pages and to collect a large number of hyperlinks that point to relevant pages. The crawl method proposed in this paper is novel in that it is based on the semi-structured features of the website and on content information. Experimental results show that it is a very effective method for focused crawling.
  • Keywords
    social networking (online); WWW finding information; Website semi structured features; hyperlinks great scale; queue efficiently downloaded; topical crawler algorithm; topical web crawler; Cities and towns; Computer applications; Computer science; Crawlers; Databases; Information resources; Laboratories; Search engines; Web pages; World Wide Web; Algorithm; Generic Crawler; Topical Crawler;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Science and Engineering, 2009. WCSE '09. Second International Workshop on
  • Conference_Location
    Qingdao
  • Print_ISBN
    978-0-7695-3881-5
  • Type
    conf
  • DOI
    10.1109/WCSE.2009.706
  • Filename
    5403244