Title :
The Design and Implementation of a Spider in Local Network
Author :
Meixia Qu ; Xiue Jiang ; Junfeng Luan ; Xingjian Ren
Author_Institution :
Sch. of Mech., Electr. & Inf. Eng., Shandong Univ. at Weihai, Weihai, China
Abstract :
This paper introduces the principles and methods of search engines and presents the design and implementation of a multi-threaded concurrent spider for a local network. The spider uses a BloomFilter to detect duplicate URLs and a thread pool to manage concurrent threads. It uses the IoC technique in Spring to support different file formats such as DOC, PDF, and XLS, which demonstrates the scalability of the whole application. The spider improves I/O performance by storing its data in a lightweight database. At the end of the paper, we compare and analyze the efficiency and performance of the local search engine against those of a general-purpose commercial search engine.
Keywords :
Internet; data structures; document handling; input-output programs; multi-threading; search engines; software performance evaluation; storage management; BloomFilter; DOC; I/O performance; IoC technique; PDF; URL duplicate; XLS; concurrent thread management; data storage; file formats; general business search engine; local network; local search engine; multithread concurrent spider; thread pool; Educational institutions; Indexes; Instruction sets; Scalability; Web pages; Spider
Conference_Title :
Industrial Control and Electronics Engineering (ICICEE), 2012 International Conference on
Conference_Location :
Xi'an
Print_ISBN :
978-1-4673-1450-3
DOI :
10.1109/ICICEE.2012.490