DocumentCode :
1038816
Title :
A graph theoretical approach to determine a join reducer sequence in distributed query processing
Author :
Chen, Ming-Syan ; Yu, Philip S.
Author_Institution :
IBM Thomas J. Watson Res. Center, Yorktown Heights, NY, USA
Volume :
6
Issue :
1
fYear :
1994
fDate :
2/1/1994 12:00:00 AM
Firstpage :
152
Lastpage :
165
Abstract :
Semijoin has traditionally been relied upon to reduce the cost of data transmission for distributed query processing. However, judiciously applying join operations as reducers can lead to further reduction in the amount of data transmission required. In view of this fact, we explore the approach of using join operations as reducers in distributed query processing. We first show that the problem of determining a sequence of join operations for a query can be transformed to that of finding a specific type of set of cuts to the corresponding query graph, where a cut to a graph is a partition of nodes in that graph. Then, in light of this concept, we prove that the problem of determining the optimal sequence of join operations for a given query graph is of exponential complexity, thus justifying the necessity of applying heuristic approaches to solve this problem. By mapping the problem of determining a sequence of join reducers into the one of finding a set of cuts, we develop (for tree and general query graphs, respectively) efficient heuristic algorithms to determine a join reducer sequence for distributed query processing. The algorithms developed are based on the concept of divide and conquer and are of polynomial time complexity. Simulation is performed to evaluate these algorithms
Keywords :
computational complexity; database theory; distributed databases; graph theory; query processing; data transmission; distributed query processing; exponential complexity; graph theoretical approach; heuristic algorithms; heuristic approaches; join operations; join reducer sequence; optimal sequence; polynomial time complexity; query graph; semijoin; Computational efficiency; Cost function; Data communication; Database systems; Heuristic algorithms; Partitioning algorithms; Performance evaluation; Query processing; Relational databases; Tree graphs;
fLanguage :
English
Journal_Title :
Knowledge and Data Engineering, IEEE Transactions on
Publisher :
ieee
ISSN :
1041-4347
Type :
jour
DOI :
10.1109/69.273034
Filename :
273034
Link To Document :
بازگشت