DocumentCode :
1959673
Title :
Handling data skew in parallel hash join computation using two-phase scheduling
Author :
Zhou, Xiaofang ; Orlowska, Maria E.
Author_Institution :
Div. of Inf. Technol., CSIRO, Canberra, ACT, Australia
Volume :
2
fYear :
1995
fDate :
19-21 Apr 1995
Firstpage :
527
Abstract :
A large number of parallel join algorithms have been proposed to maintain load balancing in the presence of data skew. However, one important type of data skew, join product skew (JPS), has been little studied. In this paper, a dynamic parallel join algorithm, which employs a two-phase scheduling procedure, is designed to handle the JPS problem. Two sets of scheduling heuristics are studied against various parameters. It is shown that many of the existing algorithms can be regarded as special cases of our algorithm, whose cost is based on the nature of the data skew. While it can cope with JPS, which other algorithms cannot handle, it can be as efficient as most existing algorithms when JPS does not exist.
Keywords :
parallel algorithms; processor scheduling; query processing; relational databases; resource allocation; data skew; dynamic parallel join algorithm; load-balancing; parallel hash join computation; two-phase scheduling; two-phase scheduling procedure; Algorithm design and analysis; Computer science; Concurrent computing; Dynamic scheduling; Government; Information technology; Parallel architectures; Processor scheduling; Relational databases; Scheduling algorithm;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Algorithms and Architectures for Parallel Processing, 1995. ICAPP 95. IEEE First ICA³PP., IEEE First International Conference on
Conference_Location :
Brisbane, Qld.
Print_ISBN :
0-7803-2018-2
Type :
conf
DOI :
10.1109/ICAPP.1995.472237
Filename :
472237