Analysis of object oriented spatial access methods
SIGMOD '87 Proceedings of the 1987 ACM SIGMOD international conference on Management of data
The R*-tree: an efficient and robust access method for points and rectangles
SIGMOD '90 Proceedings of the 1990 ACM SIGMOD international conference on Management of data
Towards an analysis of range query performance in spatial data structures
PODS '93 Proceedings of the twelfth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems
CIKM '93 Proceedings of the second international conference on Information and knowledge management
Fast subsequence matching in time-series databases
SIGMOD '94 Proceedings of the 1994 ACM SIGMOD international conference on Management of data
SIGMOD '95 Proceedings of the 1995 ACM SIGMOD international conference on Management of data
A model for the prediction of R-tree performance
PODS '96 Proceedings of the fifteenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems
Implications of certain assumptions in database performance evauation
ACM Transactions on Database Systems (TODS)
Concrete Math
R-trees: a dynamic index structure for spatial searching
SIGMOD '84 Proceedings of the 1984 ACM SIGMOD international conference on Management of data
Accurate Modeling of Region Data
IEEE Transactions on Knowledge and Data Engineering
The R+-Tree: A Dynamic Index for Multi-Dimensional Objects
VLDB '87 Proceedings of the 13th International Conference on Very Large Data Bases
Estimating the Selectivity of Spatial Queries Using the `Correlation' Fractal Dimension
VLDB '95 Proceedings of the 21th International Conference on Very Large Data Bases
Analyzing Range Queries on Spatial Data
ICDE '00 Proceedings of the 16th International Conference on Data Engineering
Hi-index | 0.00 |
In this paper, we study the node distribution of an R-tree storing region data, like, for instance, islands, lakes, or human-inhabited areas. We will show that real region datasets are packed in an R-tree into minimum bounding rectangles (MBRs) whose area distribution follows the same power law, named REGAL (REGion Area Law), as that for the regions themselves. Moreover, these MBRs are packed in their turn into MBRs following the same law, and so on iteratively, up to the root of the R-tree. Based on this observation, we are able to accurately estimate the search effort for range queries, using a small number of easy-to-retrieve parameters. Furthermore, since our analysis exploits, through a realistic mathematical model, the proximity relations existing among the regions in the dataset, we show how to use our model to predict the selectivity of a self-spatial join query posed on the dataset. Experiments on a variety of real datasets (islands, lakes, human-inhabited areas) show that our estimations are accurate, enjoying a geometric average relative error ranging from 22 percent to 32 percent for the search effort of a range query, and from 14 percent to 34 percent for the selectivity of a self-spatial join query. This is significantly better than using a naive model based on uniformity assumption, which gives rise to a geometric average relative error up to 270 percent and up to 85 percent for the two problems, respectively.