Title :
Efficient query evaluation on large textual collections in a peer-to-peer environment
Author :
Zhang, Jiangong ; Suel, Torsten
Author_Institution :
Dept. of CIS, Polytech. Univ., Brooklyn, NY, USA
Date :
31 Aug.-2 Sept. 2005
Abstract :
The authors studied the problem of evaluating ranked (top-k) queries on textual collections ranging from multiple gigabytes to terabytes in size. They focused on the case of a global index organization in a highly distributed environment and considered a class of ranking functions that includes common variants of the Cosine and Okapi measures. The main bottleneck in such a scenario is the amount of communication required during query evaluation. Several efficient query evaluation schemes were proposed and their performance was evaluated. Results on real search engine query traces and over 120 million Web pages show that, after careful optimization, such queries can be evaluated at a reasonable cost, while challenges remain for even larger collections and more general classes of ranking functions.
Keywords :
peer-to-peer computing; query processing; Cosine measure; Okapi measure; distributed environment; global index organization; large textual collection; peer-to-peer environment; query evaluation; ranked queries; ranking function; Bandwidth; Computational Intelligence Society; Cost function; Information retrieval; Peer to peer computing; Query processing; Search engines; Streaming media; Web pages; Web search
Conference_Title :
Peer-to-Peer Computing, 2005. P2P 2005. Fifth IEEE International Conference on
Print_ISBN :
0-7695-2376-5