Title :
The Crawler of Specific Resources Recognition Based on Multi-thread
Author :
Ke, Ming ; Zhang, PengZhou ; Chen, Guowei
Author_Institution :
New Media Inst., Commun. Univ. of China, Beijing, China
Abstract :
With the development of computer networks and the widespread use of the Internet, the volume of online information is increasing exponentially at the broadband level, and the difficulty and complexity of information retrieval are increasing with it. As a result, crawlers are developing rapidly. A crawler is a program that automatically collects information from the Internet. In this paper, we design and implement a multi-threaded crawler for specific resources. The crawler offers high accuracy, strong adaptability, and high efficiency. Experimental results confirm these properties.
Keywords :
Internet; information retrieval; search engines; broadband level; computer network; multithread Crawler; specific resources recognition; Accuracy; Crawlers; Data mining; Databases; Educational institutions; Instruction sets; Knowledge engineering; URL filtering; crawler; information extraction; multithreads
Conference_Title :
Computational Sciences and Optimization (CSO), 2012 Fifth International Joint Conference on
Conference_Location :
Harbin
Print_ISBN :
978-1-4673-1365-0
DOI :
10.1109/CSO.2012.130