A performance comparison of distance-based query algorithms using R-trees in spatial databases

  • Authors: Antonio Corral; Jesús M. Almendros-Jiménez

  • Affiliation (both authors): Department of Languages and Computing, University of Almeria, 04120 Almeria, Spain

  • Venue: Information Sciences: an International Journal
  • Year: 2007

Abstract

Efficient processing of distance-based queries (DBQs) is of great importance in spatial databases because of the wide range of applications that rely on such queries. The most representative and best-known DBQs are the K Nearest Neighbors Query (KNNQ), @r Distance Range Query (@rDRQ), K Closest Pairs Query (KCPQ) and @r Distance Join Query (@rDJQ). In this paper, we propose new pruning mechanisms and apply them in the design of new Recursive Best-First Search (RBFS) algorithms for DBQs between spatial objects indexed in R-trees. RBFS is a general search algorithm that runs in linear space and expands nodes in best-first order, but it can suffer from node re-expansion overhead (i.e., to expand nodes in best-first order, some nodes may be considered more than once). The R-tree and its variations are commonly cited spatial access methods that can be used for answering such spatial queries. We also include an exhaustive experimental study using R-trees, which leads to several conclusions about the efficiency of the proposed RBFS algorithms and how they compare with other search algorithms (Best-First Search (BFS) and Depth-First Branch-and-Bound (DFBnB)) in terms of disk accesses, response time and main memory requirements. The study takes into account several important parameters, such as the maximum branching factor (Cmax), the cardinality of the final query result (K), the distance threshold (@r) and the size of a global LRU buffer (B). In general, RBFS is competitive for KNNQ and KCPQ when the maximum branching factor (Cmax) is large enough (even better than DFBnB and very close to BFS), and it is a good alternative when main memory is limited, for example because of a high process load on the system, since its space consumption is linear in the height of the R-trees. Nevertheless, RBFS is the worst alternative for @rDRQ and @rDJQ.
DFBnB is also a linear-space algorithm; it behaves the same as BFS for @rDRQ and @rDJQ, and it is the best choice when an LRU buffer is included. Finally, our experiments confirm that BFS is the best for all DBQs, but it can consume a large amount of main memory when performing spatial queries.
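
As a point of reference for the four query types named in the abstract, the following Python sketch states what each query returns using brute-force scans over small point lists. It is not the paper's method: there is no R-tree, no pruning, and no RBFS, BFS or DFBnB search here. The function names (knnq, drq, kcpq, djq), the use of Euclidean distance, and the reading of the distance threshold as a single upper bound r are all assumptions made for illustration.

```python
import heapq
from itertools import product
from math import dist  # Euclidean distance, Python 3.8+


def knnq(points, q, k):
    """K Nearest Neighbors Query: the k points closest to query point q."""
    return heapq.nsmallest(k, points, key=lambda p: dist(p, q))


def drq(points, q, r):
    """Distance Range Query: every point within distance r of q (assumed upper bound)."""
    return [p for p in points if dist(p, q) <= r]


def kcpq(set_p, set_q, k):
    """K Closest Pairs Query: the k pairs (p, q) with the smallest distance."""
    return heapq.nsmallest(k, product(set_p, set_q), key=lambda pair: dist(*pair))


def djq(set_p, set_q, r):
    """Distance Join Query: every pair (p, q) whose distance is at most r."""
    return [(p, q) for p, q in product(set_p, set_q) if dist(p, q) <= r]


# Hypothetical toy data sets, chosen only to exercise the four functions.
P = [(0, 0), (1, 1), (5, 5)]
Q = [(0, 2), (6, 5)]

nearest = knnq(P, (0, 0), 2)
in_range = drq(P, (0, 0), 1.5)
closest_pairs = kcpq(P, Q, 2)
joined = djq(P, Q, 1.5)
```

The two join-style queries scan all |P| x |Q| pairs, which is the cost that index-based algorithms such as those compared in the paper aim to avoid by pruning R-tree nodes.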