DocumentCode :
2981522
Title :
Two-Level Result Caching for Web Search Queries on Structured P2P Networks
Author :
Rosas, Erika ; Hidalgo, Nicolas ; Marin, Mario
Author_Institution :
Yahoo! Labs., Santiago, Chile
fYear :
2012
fDate :
17-19 Dec. 2012
Firstpage :
221
Lastpage :
228
Abstract :
This paper proposes a two-level caching strategy for Web search queries that is designed to operate on P2P networks. The aim is to significantly reduce the query traffic sent by a large community of users to commercial search engines by placing between them a P2P caching service that stores frequent queries and distributes their answers efficiently among users. The design accounts for the highly dynamic nature of user queries, both in traffic intensity and in the drastic shifts in user interest, which are usually driven by unpredictable worldwide events. Each peer maintains an LRU result cache (RCache) that holds the answers to queries originated at the peer itself and to queries for which the peer is responsible; for the latter, the peer contacts a Web search engine on demand to obtain the answers. When query traffic is predominantly routed to a few responsible peers, our strategy replicates the role of "being responsible for" onto neighboring peers so that they can absorb part of the traffic and restore load balance. This is a fairly slow, adaptive process that we call mid-term load balancing. To achieve a fair short-term distribution of queries, each peer also keeps a location cache (LCache), which stores pointers to peers that have requested the same queries in the very recent past. This lets those peers share their query answers with newly requesting peers. The process is fast because popular queries are usually cached at the first DHT hop of a requesting peer, which quickly redistributes load among more and more peers. A comparative study shows that the proposed strategy achieves better load balance, significantly smaller communication volume among peers, and higher cache hit ratios than previous strategies.
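The interplay of the two caches described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `LRUCache`, `Peer`, and `resolve`, the cache sizes, and the representation of a DHT route as a plain non-empty list of peers ending at the responsible peer are all assumptions made for illustration, and the mid-term replication of responsibility is left out.

```python
from collections import OrderedDict


class LRUCache:
    """Small least-recently-used cache (illustrative helper)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict least recently used


class Peer:
    """Hypothetical peer holding both cache levels."""

    def __init__(self, name, rcache_size=4, lcache_size=4):
        self.name = name
        self.rcache = LRUCache(rcache_size)  # query -> answer
        self.lcache = LRUCache(lcache_size)  # query -> peer that asked recently


def resolve(query, requester, path, search_engine):
    """Resolve a query along an assumed DHT route.

    `path` is a non-empty list of peers ending at the responsible peer.
    Returns (answer, source), where source says where the answer came from.
    """
    answer = requester.rcache.get(query)
    if answer is not None:
        return answer, "local"

    source = None
    for hop in path:
        answer = hop.rcache.get(query)
        if answer is not None:
            source = "rcache:" + hop.name
        else:
            holder = hop.lcache.get(query)
            if holder is not None:
                # The pointer may be stale if the holder evicted the answer.
                answer = holder.rcache.get(query)
                if answer is not None:
                    source = "lcache:" + hop.name
        hop.lcache.put(query, requester)  # remember the most recent requester
        if answer is not None:
            break

    if answer is None:
        # Only the responsible peer contacts the search engine.
        answer = search_engine(query)
        path[-1].rcache.put(query, answer)
        source = "engine"

    requester.rcache.put(query, answer)
    return answer, source
```

In this sketch, the first request for a query travels the whole route and reaches the search engine, while a second peer asking for the same query is served through the LCache pointer left at the first hop, so the search engine is contacted only once.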
Keywords :
Internet; cache storage; peer-to-peer computing; query processing; resource allocation; search engines; telecommunication traffic; DHT hop; LCache; LRU result cache; P2P caching service; RCache; Web search engine; Web search queries; distributed hash tables; frequent query storage; load redistribution; location cache; mid-term load balancing; query answering; query traffic reduction; short-term fair distribution; structured P2P networks; traffic intensity; two-level caching strategy; two-level result caching; user queries; Communities; Engines; Load management; Peer to peer computing; Routing; Search engines; Web search; Caching Services; Distributed Hash Tables; Load Balancing; P2P networks; Web Search Engines;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Parallel and Distributed Systems (ICPADS), 2012 IEEE 18th International Conference on
Conference_Location :
Singapore
ISSN :
1521-9097
Print_ISBN :
978-1-4673-4565-1
Electronic_ISBN :
1521-9097
Type :
conf
DOI :
10.1109/ICPADS.2012.39
Filename :
6413693