Approximate Algorithms for Distance-Based Queries in High-Dimensional Data Spaces Using R-Trees

Authors:
Antonio Corral;Joaquín Cañadas;Michael Vassilakopoulos
Venue:
ADBIS '02 Proceedings of the 6th East European Conference on Advances in Databases and Information Systems
Year:
2002

Abstract

In modern database applications, the similarity or dissimilarity of complex objects is examined by performing distance-based queries (DBQs) on data of high dimensionality. The R-tree and its variations are commonly cited multidimensional access methods that can be used for answering such queries. Although the related algorithms work well for low-dimensional data spaces, their performance degrades as the number of dimensions increases (the dimensionality curse). To obtain acceptable response times in high-dimensional data spaces, algorithms that obtain approximate solutions can be used. Three approximation techniques (α-allowance, N-consider and M-consider) and the respective recursive branch-and-bound algorithms for DBQs are presented and studied in this paper. We investigate the performance of these algorithms for the most representative DBQs (the K-nearest neighbors query and the K-closest pairs query) in high-dimensional data spaces, where the point data sets are indexed by tree-like structures belonging to the R-tree family: R*-trees and X-trees. The searching strategy is tuned according to several parameters in order to examine the trade-off between cost (I/O activity and response time) and accuracy of the result. The outcome of the experimental evaluation is the identification of the best-performing approximate DBQ algorithm for large high-dimensional point data sets.
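
The abstract names the approximation techniques but does not define them, so the following is only an illustrative sketch of how an allowance-style relaxation could be attached to a recursive branch-and-bound K-nearest neighbors search. It rests on several assumptions that are not taken from the paper: a toy bounding-box tree built by median splits stands in for an R*-tree or X-tree, and a hypothetical parameter `gamma` in [0, 1) shrinks the pruning distance by the factor (1 - gamma). The names `build`, `mindist`, `knn` and `gamma` are all made up for illustration.

```python
import heapq
import math
import random


def build(points, leaf_size=4, depth=0):
    """Build a toy bounding-box tree by median splits (a stand-in for an R-tree)."""
    dims = len(points[0])
    lo = tuple(min(p[d] for p in points) for d in range(dims))
    hi = tuple(max(p[d] for p in points) for d in range(dims))
    if len(points) <= leaf_size:
        return {"lo": lo, "hi": hi, "points": points, "children": []}
    axis = depth % dims
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    return {"lo": lo, "hi": hi, "points": [],
            "children": [build(pts[:mid], leaf_size, depth + 1),
                         build(pts[mid:], leaf_size, depth + 1)]}


def mindist(q, node):
    """Smallest Euclidean distance from q to the node's bounding box."""
    return math.sqrt(sum(max(l - x, 0.0, x - h) ** 2
                         for x, l, h in zip(q, node["lo"], node["hi"])))


def knn(node, q, k, gamma=0.0, best=None):
    """Recursive branch-and-bound K-NN search.

    gamma is a hypothetical allowance parameter in [0, 1): gamma = 0 gives
    the exact answer, larger values prune more aggressively.
    """
    if best is None:
        best = []  # max-heap emulated with negated distances: (-dist, point)
    for p in node["points"]:
        d = math.dist(q, p)
        if len(best) < k:
            heapq.heappush(best, (-d, p))
        elif d < -best[0][0]:
            heapq.heapreplace(best, (-d, p))
    # Visit the closest children first so the pruning distance shrinks early.
    for child in sorted(node["children"], key=lambda c: mindist(q, c)):
        z = -best[0][0] if len(best) == k else math.inf
        # Assumed relaxation: prune against (1 - gamma) * z instead of z.
        if mindist(q, child) > (1.0 - gamma) * z:
            continue
        knn(child, q, k, gamma, best)
    return sorted((-neg_d, p) for neg_d, p in best)


if __name__ == "__main__":
    random.seed(1)
    data = [tuple(random.random() for _ in range(16)) for _ in range(2000)]
    tree = build(data)
    query = tuple(random.random() for _ in range(16))
    for g in (0.0, 0.2, 0.5):
        result = knn(tree, query, 5, gamma=g)
        print(g, [round(d, 4) for d, _ in result])
```

Under this assumed rule, any subtree that is skipped lies farther than (1 - gamma) times the final K-th distance, so the reported K-th distance is at most 1 / (1 - gamma) times the exact one. Raising `gamma` therefore trades accuracy for fewer visited nodes, which is the kind of cost versus accuracy trade-off the abstract describes.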