• DocumentCode
    3063469
  • Title
    The Crawler of Specific Resources Recognition Based on Multi-thread
  • Author
    Ke, Ming; Zhang, PengZhou; Chen, Guowei
  • Author_Institution
    New Media Inst., Commun. Univ. of China, Beijing, China
  • fYear
    2012
  • fDate
    23-26 June 2012
  • Firstpage
    569
  • Lastpage
    572
  • Abstract
    With the development of computer networks and the widespread use of the Internet, online information at the broadband level is increasing exponentially, and the difficulty and complexity of information retrieval are increasing with it, so crawlers are developing rapidly. A crawler is a program that automatically collects information from the Internet. In this paper, we design and implement a multi-threaded crawler for specific resources. The crawler offers high accuracy, strong adaptability, and high efficiency. Experimental results confirm these properties.
  • Keywords
    Internet; information retrieval; search engines; broadband level; computer network; multithread Crawler; specific resources recognition; Accuracy; Crawlers; Data mining; Databases; Educational institutions; Instruction sets; Knowledge engineering; URL filtering; crawler; information extraction; multithreads
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computational Sciences and Optimization (CSO), 2012 Fifth International Joint Conference on
  • Conference_Location
    Harbin
  • Print_ISBN
    978-1-4673-1365-0
  • Type
    conf
  • DOI
    10.1109/CSO.2012.130
  • Filename
    6274791