Title :
SPRITE: A Learning-Based Text Retrieval System in DHT Networks
Author :
Yingguang Li ; H.V. Jagadish ; Kian-Lee Tan
Author_Institution :
National Univ. of Singapore, Singapore
Abstract :
In this paper, we propose SPRITE (Selective PRogressive Index Tuning by Examples), a scalable system for text retrieval in a structured P2P network. Under SPRITE, each peer is responsible for a certain number of terms. For each document, however, SPRITE learns from past queries to select only a small set of representative terms for indexing, and these terms are progressively refined with subsequent queries. We implemented the proposed strategy and compared its retrieval effectiveness, in terms of both precision and recall, against a static scheme (without learning) and a centralized system (the ideal case). Our experimental results show that SPRITE is nearly as effective as the centralized system and considerably outperforms the static scheme.
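The idea of choosing index terms from past queries can be illustrated with a minimal sketch. This is not the algorithm from the paper: the abstract does not give the selection rule, so the sketch assumes a simple one (rank a document's terms by how many past queries contain them and keep the top k). The names `responsible_peer` and `select_index_terms`, the modulo-hash placement of terms on peers, and the sample data are all hypothetical.

```python
import hashlib
from collections import Counter


def responsible_peer(term, num_peers):
    """Map a term to a peer id by hashing (an assumed stand-in for a DHT lookup)."""
    digest = hashlib.sha1(term.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_peers


def select_index_terms(doc_terms, past_queries, k):
    """Return up to k document terms that appear in the most past queries.

    Assumed scoring rule: one point per query containing the term.
    """
    doc_vocab = set(doc_terms)
    hits = Counter()
    for query in past_queries:
        for term in set(query):  # count each term once per query
            if term in doc_vocab:
                hits[term] += 1
    # Highest query count first; ties broken alphabetically for determinism.
    ranked = sorted(hits, key=lambda t: (-hits[t], t))
    return ranked[:k]


# Hypothetical document and query log.
doc = ["peer", "index", "retrieval", "network", "text", "banana"]
queries = [["text", "retrieval"], ["peer", "retrieval"], ["index", "tuning"]]

selected = select_index_terms(doc, queries, k=2)

# Progressive refinement: re-run the selection once more queries have arrived.
later_queries = queries + [["peer"], ["peer", "network"]]
refined = select_index_terms(doc, later_queries, k=2)

# Each selected term would be published to the peer responsible for it.
placement = {term: responsible_peer(term, num_peers=8) for term in refined}
```

Re-running the selection on a longer query log is the sketch's analogue of progressive refinement: terms that later queries favor displace earlier choices, while terms that no query ever mentions (such as "banana" here) are never indexed.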
Keywords :
indexing; information retrieval; peer-to-peer computing; text analysis; DHT network; SPRITE; learning-based text retrieval system; progressive index tuning by examples; queries; retrieval effectiveness; structured P2P network; Bandwidth; Costs; Floods; Indexing; Large-scale systems; Peer to peer computing; Probes; Routing; Sprites (computer)
Conference_Title :
Data Engineering, 2007. ICDE 2007. IEEE 23rd International Conference on
Conference_Location :
Istanbul
Print_ISBN :
1-4244-0802-4
DOI :
10.1109/ICDE.2007.368969