Title :
Design and Implementation of a Crawling System in Shopping Search Engine
Author_Institution :
Sch. of Journalism & Commun., Tsinghua Univ., Beijing, China
Abstract :
This paper presents a crawling system that helps build a shopping search engine. The system's core is a vertical crawler designed specifically for shopping search engines. The crawler is divided into five parts, and its analysis module, a set of regular-expression methods, generates crawling templates automatically. A manual template-generation function is provided to support the automatic one, and all of the manually modified templates passed testing. The five parts combine into a fool-proof system that anyone with basic computer skills can use. With the rapid development of search engines, topical crawlers that serve vertical search engines have become an important research direction. Many publications describe crawler design algorithms and crawling strategies; however, owing to the competitive nature of the shopping search engine business, few papers describe the design and implementation of topical crawlers for shopping search engines. This paper's main contribution is to fill that gap. The crawling system described in this paper is a prototype of a crawler system for a shopping search engine. It will influence the research and development of later shopping search engines.
Keywords :
Internet; electronic commerce; online front-ends; search engines; systems analysis; crawling template; fool-proof system; regular expressions method set; shopping search engine; vertical crawling system; Algorithm design and analysis; Automatic testing; Buildings; Computer science; Crawlers; Design engineering; Electronic mail; Java; Prototypes; Search engines; Automatic generation of crawler module; Crawler system; Shopping search engine;
Conference_Title :
Computer Science and Engineering, 2009. WCSE '09. Second International Workshop on
Conference_Location :
Qingdao
Print_ISBN :
978-0-7695-3881-5
DOI :
10.1109/WCSE.2009.798