DocumentCode :
2869469
Title :
Comparative performance of parallel join algorithms
Author :
Wolf, Joel L. ; Dias, Daniel M. ; Yu, Philip S. ; Turek, John
Author_Institution :
IBM Thomas J. Watson Res. Center, Yorktown Heights, NY, USA
fYear :
1991
fDate :
4-6 Dec 1991
Firstpage :
78
Lastpage :
88
Abstract :
The authors recently (1990, 1991) described two new join algorithms designed to address the data skew problem. These algorithms were based, respectively, on the traditional sort merge and hash join algorithms, and employed techniques borrowed from mathematical optimization theory. The current paper proposes significant improvements to both algorithms, increasing their effectiveness while simultaneously decreasing their execution times. It then focuses on the comparative performance of the improved algorithms and their more conventional sort merge and hash counterparts. The latter two are perfectly good algorithms except that they fail to deal with data skew. Both I/O- and CPU-bound configurations were examined. The new algorithms outperform their more conventional counterparts in the presence of just about any skew at all, dramatically so in cases of high skew
Keywords :
database theory; file organisation; merging; parallel algorithms; performance evaluation; relational databases; sorting; comparative performance; data skew; hash join; load balancing; mathematical optimization; parallel join algorithms; relational databases; sort merge; Algorithm design and analysis; Costs; Filters; Load management; Parallel algorithms; Parallel architectures; Parallel processing; Partitioning algorithms; Relational databases; Robustness;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Parallel and Distributed Information Systems, 1991., Proceedings of the First International Conference on
Conference_Location :
Miami Beach, FL
Print_ISBN :
0-8186-2295-4
Type :
conf
DOI :
10.1109/PDIS.1991.183070
Filename :
183070
Link To Document :
بازگشت