DocumentCode :
1396032
Title :
Analysis of range queries and self-spatial join queries on real region datasets stored using an R-tree
Author :
Proietti, Guido ; Faloutsos, Christos
Author_Institution :
Dipt. di Matematica Pura ed Applicata, L´´Aquila Univ., Italy
Volume :
12
Issue :
5
fYear :
2000
Firstpage :
751
Lastpage :
762
Abstract :
In this paper, we study the node distribution of an R-tree storing region data, like, for instance, islands, lakes, or human-inhabited areas. We will show that real region datasets are packed in an R-tree into minimum bounding rectangles (MBRs) whose area distribution follows the same power law, named REGAL (REGion Area Law), as that for the regions themselves. Moreover, these MBRs are packed in their turn into MBRs following the same law, and so on iteratively, up to the root of the R-tree. Based on this observation, we are able to accurately estimate the search effort for range queries, using a small number of easy-to-retrieve parameters. Furthermore, since our analysis exploits, through a realistic mathematical model, the proximity relations existing among the regions in the dataset, we show how to use our model to predict the selectivity of a self-spatial join query posed on the dataset. Experiments on a variety of real datasets (islands, lakes, human-inhabited areas) show that our estimations are accurate, enjoying a geometric average relative error ranging from 22 percent to 32 percent for the search effort of a range query, and from 14 percent to 34 percent for the selectivity of a self-spatial join query. This is significantly better than using a naive model based on uniformity assumption, which gives rise to a geometric average relative error up to 270 percent and up to 85 percent for the two problems, respectively
Keywords :
computational complexity; query processing; tree data structures; visual databases; R-tree; R-tree storing region data; REGAL; geometric average relative error; human-inhabited areas; mathematical model; minimum bounding rectangles; node distribution; proximity relations; range queries; real datasets; real region datasets; self-spatial join queries; self-spatial join query; Data structures; Helium; Lakes; Mathematical model; Nearest neighbor searches; Performance analysis; Power measurement; Predictive models; Solid modeling; Spatial databases;
fLanguage :
English
Journal_Title :
Knowledge and Data Engineering, IEEE Transactions on
Publisher :
ieee
ISSN :
1041-4347
Type :
jour
DOI :
10.1109/69.877506
Filename :
877506
Link To Document :
بازگشت