DocumentCode :
3155985
Title :
Search-Engine-Oriented Theme Crawler Design
Author :
Dong, Qin
Author_Institution :
Yancheng Inst. of Technol., Yancheng, China
Volume :
2
fYear :
2010
fDate :
12-14 Nov. 2010
Firstpage :
303
Lastpage :
306
Abstract :
A theme crawler is the most important part of a vertical search engine. This paper studies the design of a theme crawler that recalls web pages efficiently and accurately. Seed links and similarity measurement, the two key techniques of a theme crawler, are explained in detail, and the relevant program code and algorithms are provided to illustrate these two techniques clearly. The crawling process begins with fetching seed links; the host search engine, the search engine interface, and link fetching are illustrated in the paper. To improve the crawler's efficiency, a page evaluation model is added to the crawler module.
Keywords :
search engines; page evaluation; program codes; theme crawler; vertical search engine; Arrays; Crawlers; Engines; Google; Transforms; Web pages
fLanguage :
English
Publisher :
IEEE
Conference_Title :
System Science, Engineering Design and Manufacturing Informatization (ICSEM), 2010 International Conference on
Conference_Location :
Yichang
Print_ISBN :
978-1-4244-8664-9
Type :
conf
DOI :
10.1109/ICSEM.2010.169
Filename :
5640213