Title :
A distributed search engine based on a re-ranking algorithm model
Author :
Wan, Jingyong ; Wang, Beizhan ; Guo, Wei ; Chen, Kang ; Wang, Jiajun
Author_Institution :
Software School of Xiamen University, Xiamen, China
Abstract :
With the rapid increase in the number of websites and the explosive growth of Internet information, centralized search engines face great challenges in mass data processing and storage. A distributed search engine can address these problems effectively. In this paper, we describe the design and implementation of a distributed search engine based on Apache Nutch, Solr, and Hadoop. Taking users' click logs into account, we propose a re-ranking algorithm based on Lucene scoring. Our experimental results show that our approach effectively meets users' demand for searching massive data while maintaining high reliability and scalability.
Keywords :
Algorithm design and analysis; Indexing; Search engines; Servers; Software; Web pages; Distributed Search Engine; Hadoop; Re-ranking Algorithm;
Conference_Title :
2015 10th International Conference on Computer Science & Education (ICCSE)
Conference_Location :
Cambridge, United Kingdom
Print_ISBN :
978-1-4799-6598-4
DOI :
10.1109/ICCSE.2015.7250325