DocumentCode :
1961870
Title :
Data redundancy and duplicate detection in spatial join processing
Author :
Dittrich, Jens-Peter ; Seeger, Bernhard
Author_Institution :
Dept. of Math. & Comput. Sci., Marburg Univ., Germany
fYear :
2000
fDate :
2000
Firstpage :
535
Lastpage :
546
Abstract :
The partition-based spatial-merge join (PBSM) of J.M. Patel and D.J. DeWitt (1996) and the size separation spatial join (S3J) of N. Koudas and K.C. Sevcik (1997) are considered to be among the most efficient methods for processing spatial (intersection) joins on two or more spatial relations. Neither method assumes the presence of pre-existing spatial indices on the relations. In this paper, we propose several improvements to these join algorithms. In particular, we deal with the impact of data redundancy and duplicate detection on the performance of these methods. For PBSM, we present a simple and inexpensive online method to detect duplicates in the response set. There is no longer any need to eliminate duplicates in a final sorting phase, as was originally suggested. We also investigate the impact of different internal algorithms on the total run-time of PBSM. For S3 J, we break with the original design goal and introduce controlled redundancy of data objects. Results of a large set of experiments with real data sets reveal that our suggested modifications to PBSM and S3J result in substantial performance improvements, where PBSM is generally superior to S3J
Keywords :
data reduction; merging; query processing; redundancy; software performance evaluation; visual databases; controlled data object redundancy; data redundancy; duplicate detection; internal algorithms; intersection joins; online method; partition-based spatial-merge join; performance; response set; run-time; size separation spatial join; sorting phase; spatial join processing; spatial relations; Availability; Computer science; Costs; Filters; Lead; Mathematics; Multidimensional systems; Partitioning algorithms; Runtime; Sorting;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Data Engineering, 2000. Proceedings. 16th International Conference on
Conference_Location :
San Diego, CA
ISSN :
1063-6382
Print_ISBN :
0-7695-0506-6
Type :
conf
DOI :
10.1109/ICDE.2000.839452
Filename :
839452
Link To Document :
بازگشت