DocumentCode :
2408002
Title :
Research of vertical search engine in news industry
Author :
Li, Meng ; Gu, X.J. ; Yang, Z.X.
Author_Institution :
Lishui Univ., Lishui, China
fYear :
2012
fDate :
8-9 Nov. 2012
Firstpage :
253
Lastpage :
256
Abstract :
Building on a review of existing Web crawlers and the theory of full-text retrieval, this paper optimizes a Web crawler algorithm so that it meets the needs of vertical search engines, and then combines the Pangu word segmentation component with Lucene.Net to build an efficient full-text search function. The paper's innovation is to analyze the characteristics of news sites and integrate those features into a traditional vertical search engine. Based on the information requirements of news sites, it studies the relevant full-text search framework and applies multithreaded data collection and retrieval to the vertical search engine, with the goal of better performance and user experience. The entire system was commissioned and tested on small and medium-sized news sites. The tests show that the crawler meets the news industry's requirements for efficient and timely collection, and that the full-text retrieval system for news content, built by integrating Pangu segmentation with Lucene.Net, meets the demands for accurate queries and fast response times, thereby increasing the amount of information available and improving the user experience.
Keywords :
information retrieval; search engines; Lucene; Pango; Web crawler optimization algorithm; full-text retrieval; full-text retrieval system; full-text search framework; multithreaded data collection; theoretical knowledge; vertical search engine; vertical search engines; Pangu word; full-text search;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Management of Technology (ISMOT), 2012 International Symposium on
Conference_Location :
Hangzhou
Type :
conf
DOI :
10.1109/ISMOT.2012.6679470
Filename :
6679470