Title :
Making Search Efficient on Gnutella-Like P2P Systems
Author :
Zhu, Yingwu ; Yang, Xiaoyu ; Hu, Yiming
Author_Institution :
Dept. of Electr. Eng. & Comput. Sci., Cincinnati Univ., OH, USA
Abstract :
Leveraging state-of-the-art information retrieval (IR) algorithms such as the vector space model (VSM) and relevance ranking, we present GES, an efficient IR system built on top of Gnutella-like P2P networks. The key idea is that GES employs a distributed, content-based, and capacity-aware topology adaptation algorithm to organize nodes (each of which is represented by a node vector) into semantic groups. The intuition behind this design is that semantically associated nodes within a semantic group tend to be relevant to the same queries. Given a query, GES uses a capacity-aware search protocol based on semantic groups and selective one-hop node vector replication to direct the query to the most relevant nodes, thereby achieving high recall while probing only a small fraction of nodes. Moreover, GES adopts automatic query expansion techniques to improve the quality of search results, and it is the first work to show that node vector size plays a very important role in system performance. The experimental results show that GES is very efficient and even outperforms centralized node clustering systems such as SETS.
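As a rough illustration of the idea that each node is represented by a node vector and that queries are directed to the most relevant nodes, the following minimal sketch ranks nodes by VSM cosine similarity against a query. It is not the GES protocol itself: the raw term-frequency weighting, the function names (`term_vector`, `cosine`, `rank_nodes`), and the sample node contents are all assumptions made for illustration, and topology adaptation, capacity awareness, and replication are omitted.

```python
import math
from collections import Counter


def term_vector(text):
    """Build a raw term-frequency vector (an assumed, simplified weighting)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_nodes(query, node_vectors):
    """Return node names ordered from most to least similar to the query."""
    q = term_vector(query)
    scored = [(cosine(q, vec), name) for name, vec in node_vectors.items()]
    # Sort by descending similarity; break ties by name for determinism.
    return [name for _, name in sorted(scored, key=lambda s: (-s[0], s[1]))]


# Hypothetical node vectors, each summarizing the documents held by one node.
nodes = {
    "node_a": term_vector("jazz music saxophone jazz records"),
    "node_b": term_vector("linux kernel scheduler patches"),
    "node_c": term_vector("music theory piano"),
}

ranking = rank_nodes("jazz music", nodes)
print(ranking)
```

In this toy setting a query would be forwarded to the top-ranked nodes first, which is one simple way to probe only a small fraction of nodes.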
Keywords :
information retrieval systems; peer-to-peer computing; query formulation; semantic networks; GES; Gnutella-like P2P networks; IR system; VSM; automatic query expansion technique; capacity-aware topology adaptation algorithm; information retrieval algorithm; peer-to-peer systems; relevance ranking algorithm; search protocol; selective one-hop node vector replication; semantic group; Costs; Floods; Humans; Information retrieval; Keyword search; Network topology; Protocols; Routing; System performance;
Conference_Title :
Parallel and Distributed Processing Symposium, 2005. Proceedings. 19th IEEE International
Print_ISBN :
0-7695-2312-9
DOI :
10.1109/IPDPS.2005.273