Title :
Saturn: Range Queries, Load Balancing and Fault Tolerance in DHT Data Systems
Author :
Pitoura, Theoni ; Ntarmos, Nikos ; Triantafillou, Peter
Author_Institution :
Dept. of Comput. Eng. & Inf., Univ. of Patras, Patras, Greece
fDate :
7/1/2012 12:00:00 AM
Abstract :
In this paper, we present Saturn, an overlay architecture for large-scale data networks maintained over Distributed Hash Tables (DHTs) that efficiently processes range queries and ensures access load balancing and fault-tolerance. Placing consecutive data values in neighboring peers is desirable in DHTs since it accelerates range query processing; however, such a placement is highly susceptible to load imbalances. At the same time, DHTs may be susceptible to node departures/failures and high data availability and fault tolerance are significant issues. Saturn deals effectively with these problems through the introduction of a novel multiple ring, order-preserving architecture. The use of a novel order-preserving hash function ensures fast range query processing. Replication across and within data rings (termed vertical and horizontal replication) forms the foundation over which our mechanisms are developed, ensuring query load balancing and fault tolerance, respectively. Our detailed experimentation study shows strong gains in range query processing efficiency, access load balancing, and fault tolerance, with low replication overheads. The significance of Saturn is not only that it effectively tackles all three issues together - i.e., supporting range queries, ensuring load balancing, and providing fault tolerance over DHTs - but also that it can be applied on top of any order-preserving DHT enabling it to dynamically handle replication and, thus, to trade off replication costs for fair load distribution and fault tolerance.
Keywords :
fault tolerant computing; file organisation; resource allocation; DHT data system; Saturn; access load balancing; consecutive data values; data availability; distributed hash table; fault tolerance; large scale data networks; load distribution; multiple ring; order-preserving DHT; order-preserving architecture; order-preserving hash function; overlay architecture; query load balancing; range query processing efficiency; replication overhead; Arrays; Fault tolerance; Fault tolerant systems; Load management; Peer to peer computing; Query processing; Saturn; Distributed databases; distributed applications; fault tolerance; internet applications.; query processing;
Journal_Title :
Knowledge and Data Engineering, IEEE Transactions on
DOI :
10.1109/TKDE.2010.266