DocumentCode :
2718197
Title :
A Class-Based Search System in Unstructured P2P Networks
Author :
Huang, Juncheng ; Li, Xiuqi ; Wu, Jie
Author_Institution :
Dept. of Comput. Sci. & Eng., Florida Atlantic Univ., Boca Raton, FL
fYear :
2007
fDate :
21-23 May 2007
Firstpage :
76
Lastpage :
83
Abstract :
Efficient searching is one of the important design issues in peer-to-peer (P2P) networks. Among various searching techniques, semantic-based searching has drawn significant attention recently. The Gnutella-like efficient searching system (GES) in the work of Zhu et al. (2005) is one such system. GES derives a node vector, a semantic summary of all of the documents on a node, based on the vector space model (VSM). The topology adaptation algorithm and search protocol are then designed according to the similarity between the node vectors of different nodes. However, although GES is suitable when the distribution of documents in each node is uniform, it may not be efficient when the distribution is diverse. When there are many categories of documents at each node, the node vector representation may be inaccurate. We extend the idea of GES and present a class-based semantic searching system (CSS). It uses a data clustering algorithm, online spherical k-means clustering (OSKM) in the work of Zhang (2005), to cluster all documents on a node into several classes. Each class can be viewed as a virtual node. Virtual nodes are connected through virtual links. As a result, the class vector replaces the node vector and plays an important role in the class-based topology adaptation and search process, which makes CSS very efficient. Our simulation using the IR benchmark TREC collection demonstrates that CSS outperforms GES, with higher recall, higher precision, and lower search cost.
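The contrast the abstract draws between a single node vector and per-class vectors can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the function names, the four-term toy vocabulary, the learning rate, the seed choice, and the simplified online update (assign each document to the most similar centroid, nudge that centroid toward the document, renormalize to unit length).

```python
import math

def normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def cosine(a, b):
    """Cosine similarity between two term vectors."""
    a, b = normalize(a), normalize(b)
    return sum(x * y for x, y in zip(a, b))

def node_vector(docs):
    """One summary vector for the whole node: normalized sum of its documents."""
    return normalize([sum(col) for col in zip(*docs)])

def online_spherical_kmeans(docs, seeds, rate=0.1, epochs=5):
    """Simplified online spherical k-means (illustrative, hypothetical parameters)."""
    centroids = [normalize(s) for s in seeds]
    for _ in range(epochs):
        for d in docs:
            x = normalize(d)
            # Pick the centroid with the highest cosine similarity to the document.
            best = max(
                range(len(centroids)),
                key=lambda i: sum(p * q for p, q in zip(centroids[i], x)),
            )
            # Move the winning centroid toward the document, then renormalize.
            moved = [c + rate * xi for c, xi in zip(centroids[best], x)]
            centroids[best] = normalize(moved)
    return centroids

# Toy term-frequency vectors over the assumed vocabulary
# [music, guitar, protein, gene]: two topics stored on one node.
docs = [
    [3, 2, 0, 0],
    [2, 3, 0, 0],
    [0, 0, 4, 1],
    [0, 0, 1, 4],
]
query = [1, 1, 0, 0]

nv = node_vector(docs)
class_vectors = online_spherical_kmeans(docs, seeds=[docs[0], docs[2]])

node_score = cosine(query, nv)
class_score = max(cosine(query, c) for c in class_vectors)
print(f"node vector score: {node_score:.3f}")
print(f"best class vector score: {class_score:.3f}")
```

In this toy setup the node vector averages two unrelated topics, so its similarity to a music query is diluted, while the class vector for the music cluster matches the query closely. That is the kind of inaccuracy the abstract attributes to node vectors on nodes with many document categories.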
Keywords :
pattern clustering; peer-to-peer computing; search problems; telecommunication network topology; Gnutella-like efficient searching system; class-based search system; class-based semantic searching system; data clustering algorithm; online spherical k-means clustering; peer-to-peer networks; search protocol; topology adaptation algorithm; unstructured P2P networks; vector space model; Cascading style sheets; Clustering algorithms; Computer science; Costs; Design engineering; Floods; Network topology; Peer to peer computing; Protocols; Routing;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Advanced Information Networking and Applications, 2007. AINA '07. 21st International Conference on
Conference_Location :
Niagara Falls, ON
ISSN :
1550-445X
Print_ISBN :
0-7695-2846-5
Type :
conf
DOI :
10.1109/AINA.2007.8
Filename :
4220879