DocumentCode :
3268258
Title :
Hash-merge join: a non-blocking join algorithm for producing fast and early join results
Author :
Mokbel, Mohamed F. ; Lu, Ming ; Aref, Walid G.
Author_Institution :
Dept. of Comput. Sci., Purdue Univ., West Lafayette, IN, USA
fYear :
2004
fDate :
30 March-2 April 2004
Firstpage :
251
Lastpage :
262
Abstract :
We introduce the hash-merge join algorithm (HMJ, for short); a new nonblocking join algorithm that deals with data items from remote sources via unpredictable, slow, or bursty network traffic. The HMJ algorithm is designed with two goals in mind: (1) minimize the time to produce the first few results, and (2) produce join results even if the two sources of the join operator occasionally get blocked. The HMJ algorithm has two phases: The hashing phase and the merging phase. The hashing phase employs an in-memory hash-based join algorithm that produces join results as quickly as data arrives. The merging phase is responsible for producing join results if the two sources are blocked. Both phases of the HMJ algorithm are connected via a flushing policy that flushes in-memory parts into disk storage once the memory is exhausted. Experimental results show that HMJ combines the advantages of two state-of-the-art nonblocking join algorithms (XJoin and Progressive Merge Join) while avoiding their shortcomings.
Keywords :
file organisation; merging; program verification; query processing; Progressive Merge Join; XJoin; algorithm correctness proving; disk storage; hashing phase; in-memory hash-based join algorithm; join operator; merging phase; network traffic; nonblocking join algorithm; Data engineering;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Data Engineering, 2004. Proceedings. 20th International Conference on
ISSN :
1063-6382
Print_ISBN :
0-7695-2065-0
Type :
conf
DOI :
10.1109/ICDE.2004.1320002
Filename :
1320002
Link To Document :
بازگشت