• DocumentCode
    2170187
  • Title

    A distributed search engine based on a re-ranking algorithm model

  • Author

    Wan, Jingyong ; Wang, Beizhan ; Guo, Wei ; Chen, Kang ; Wang, Jiajun

  • Author_Institution
    Software School of Xiamen University, Xiamen, China
  • fYear
    2015
  • fDate
    22-24 July 2015
  • Firstpage
    640
  • Lastpage
    644
  • Abstract
    With the rapid increase in the number of websites and the explosive growth of Internet information, centralized search engines face great challenges in mass data processing and storage. A distributed search engine can address these problems effectively. In this paper, we describe the design and implementation of a distributed search engine based on Apache Nutch, Solr, and Hadoop. Taking users' click logs into account, we propose a re-ranking algorithm based on Lucene scoring. Our experimental results show that our approach effectively meets users' demand for searching massive data while maintaining high reliability and scalability.
  • Keywords
    Algorithm design and analysis; Indexing; Search engines; Servers; Software; Web pages; Distributed Search Engine; Hadoop; Re-ranking Algorithm
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Science & Education (ICCSE), 2015 10th International Conference on
  • Conference_Location
    Cambridge, United Kingdom
  • Print_ISBN
    978-1-4799-6598-4
  • Type

    conf

  • DOI
    10.1109/ICCSE.2015.7250325
  • Filename
    7250325