Cost-Based Unbalanced R-Trees

Authors:
Kenneth A. Ross;Inga Sitzmann;Peter J. Stuckey
Affiliations:
-;-;-
Venue:
SSDBM '01 Proceedings of the 13th International Conference on Scientific and Statistical Database Management
Year:
2001

Citing 0
Cited 4

Compacting discriminator information for spatial trees

ADC '02 Proceedings of the 13th Australasian database conference - Volume 5
Object-based and image-based object representations

ACM Computing Surveys (CSUR)
A portal for access to complex distributed information about energy

dg.o '02 Proceedings of the 2002 annual national conference on Digital government research
Query Responsive Index Structures

GIScience '08 Proceedings of the 5th international conference on Geographic Information Science

Quantified Score

Hi-index	0.00

Visualization

Abstract

Abstract: Cost-based unbalanced R-trees (CUR-trees) are a cost-function based data structure for spatial data. CUR-trees are constructed specifically to improve the evaluation of intersection queries, the most basic selection query in an R-tree. A CUR-tree is built taking into account a given query distribution for the queries and a cost model for their execution. Depending on the expected frequency of access, objects or subtrees are stored higher up in the tree. After each insertion in the tree, local reorganizations of a node and its children have their expected query cost evaluated, and a reorganization is performed if this is beneficial. No strict balancing of the trees applies allowing the tree to unfold solely based on the result of the cost evaluation. We present our cost-based approach and describe the evaluation and reorganization operations based on the cost function. We present a cost model for in-memory access costs and we present three different query models. In our experiments, we compare the performance of the CUR-tree to the R-tree and the R*-tree. The CUR-tree is able to significantly improve intersection query performance, without unacceptably increasing the cost to build the tree. The use of R-trees for in-memory data reflects the high (and growing) cost of bringing data from RAM into the CPU cache relative to the cost of other computation.