Processing distance-based queries in multidimensional data spaces using R-trees

Authors:
Antonio Corral;Joaquin Cañadas;Michael Vassilakopoulos
Affiliations:
Department of Languages and Computation, University of Almeria, Almeria, Spain;Department of Languages and Computation, University of Almeria, Almeria, Spain;Department of Information Technology, Technological Educational Institute of Thessaloniki, Greece
Venue:
PCI'01 Proceedings of the 8th Panhellenic conference on Informatics
Year:
2001

Abstract

In modern database applications, the similarity or dissimilarity of data objects is examined by performing distance-based queries (DBQs) on multidimensional data. The R-tree and its variations are commonly cited multidimensional access methods. In this paper, we investigate the performance of the most representative distance-based queries in multidimensional data spaces, where the point datasets are indexed by tree-like structures belonging to the R-tree family. To perform the K-nearest neighbor query (K-NNQ) and the K-closest pair query (K-CPQ), non-incremental recursive branch-and-bound algorithms are employed. The K-CPQ is shown to be a very expensive query for datasets of high cardinality, and it becomes even more costly as the dimensionality increases. We also give ε-approximate versions of the DBQ algorithms that run faster than the exact ones, at the expense of introducing a relative distance error in the result. Experimentation with synthetic multidimensional point datasets, following Uniform and Gaussian distributions, reveals that the best index structure for K-NNQ is the X-tree. However, for K-CPQ, the R*-tree outperforms the X-tree with respect to response time and number of disk accesses when an LRU buffer is used. Moreover, applying the ε-approximate technique to the recursive K-CPQ algorithm quickly yields acceptable approximations of the result, although the tradeoff between cost and accuracy cannot be easily controlled by users.
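
To make the idea of a recursive branch-and-bound K-NNQ with an approximation parameter more concrete, the following is a minimal Python sketch. It is not the authors' algorithm or code: the tree is a toy bounding-box hierarchy built by median splits rather than an R-tree, R*-tree, or X-tree, and the names `build`, `mindist`, `knn`, and `eps` are hypothetical. The pruning rule used here (skip a subtree when its minimum possible distance times 1 + eps is no smaller than the current K-th best distance) is one common way to obtain a relative-error bound and is assumed for illustration only.

```python
import heapq
import math
import random


def mindist(q, lo, hi):
    """Smallest Euclidean distance from point q to the box [lo, hi]."""
    return math.sqrt(sum(max(l - x, 0.0, x - h) ** 2
                         for x, l, h in zip(q, lo, hi)))


def build(points, leaf_size=8, depth=0):
    """Toy bounding-box tree built by median splits.

    Assumption: this stands in for an R-tree-family index; it is not one.
    """
    dim = len(points[0])
    lo = tuple(min(p[d] for p in points) for d in range(dim))
    hi = tuple(max(p[d] for p in points) for d in range(dim))
    if len(points) <= leaf_size:
        return {"lo": lo, "hi": hi, "points": points, "children": None}
    axis = depth % dim
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    return {"lo": lo, "hi": hi, "points": None,
            "children": [build(pts[:mid], leaf_size, depth + 1),
                         build(pts[mid:], leaf_size, depth + 1)]}


def knn(node, q, k, eps=0.0, best=None):
    """Recursive branch-and-bound K-NN search.

    best is a max-heap of (-distance, point) holding the K best so far.
    eps = 0 gives the exact answer; eps > 0 prunes more aggressively.
    """
    if best is None:
        best = []
    if node["children"] is None:
        # Leaf: compare the query against every stored point.
        for p in node["points"]:
            d = math.dist(q, p)
            if len(best) < k:
                heapq.heappush(best, (-d, p))
            elif d < -best[0][0]:
                heapq.heapreplace(best, (-d, p))
        return best
    # Internal node: visit children in order of increasing MINDIST.
    children = sorted(node["children"],
                      key=lambda c: mindist(q, c["lo"], c["hi"]))
    for child in children:
        bound = -best[0][0] if len(best) == k else math.inf
        # Prune when the subtree cannot improve the K-th distance
        # by more than the tolerated relative error.
        if mindist(q, child["lo"], child["hi"]) * (1.0 + eps) >= bound:
            continue
        knn(child, q, k, eps, best)
    return best


if __name__ == "__main__":
    random.seed(0)
    data = [tuple(random.random() for _ in range(4)) for _ in range(500)]
    index = build(data)
    query = (0.5, 0.5, 0.5, 0.5)
    for e in (0.0, 0.5):
        result = sorted(-d for d, _ in knn(index, query, 5, eps=e))
        print(e, result)
```

Under this pruning rule, every skipped subtree contains only points at distance at least the final K-th distance divided by 1 + eps, so the K-th distance returned is at most 1 + eps times the exact one. A K-CPQ version would follow the same pattern over pairs of nodes from two trees, which this sketch does not attempt.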