DocumentCode
2170187
Title
A distributed search engine based on a re-ranking algorithm model
Author
Wan, Jingyong ; Wang, Beizhan ; Guo, Wei ; Chen, Kang ; Wang, Jiajun
Author_Institution
Software School of Xiamen University, Xiamen, China
fYear
2015
fDate
22-24 July 2015
Firstpage
640
Lastpage
644
Abstract
With the rapid increase in the number of websites and the explosive growth of Internet information, centralized search engines face great challenges in mass data processing and mass data storage. Distributed search engines, however, can address this problem effectively. In this paper, we describe the design and implementation of a distributed search engine based on Apache Nutch, Solr and Hadoop. Taking users' click logs into account, we propose a re-ranking algorithm based on Lucene scoring. Our experimental results show that our approach satisfies users' demand for searching massive data while maintaining high reliability and scalability.
Keywords
Algorithm design and analysis; Indexing; Search engines; Servers; Software; Web pages; Distributed Search Engine; Hadoop; Re-ranking Algorithm
fLanguage
English
Publisher
IEEE
Conference_Titel
Computer Science & Education (ICCSE), 2015 10th International Conference on
Conference_Location
Cambridge, United Kingdom
Print_ISBN
978-1-4799-6598-4
Type
conf
DOI
10.1109/ICCSE.2015.7250325
Filename
7250325