Parallel R-trees

  • Authors:
  • Ibrahim Kamel;Christos Faloutsos

  • Affiliations:
  • University of Maryland, College Park, MD;University of Maryland, College Park, MD

  • Venue:
  • SIGMOD '92 Proceedings of the 1992 ACM SIGMOD international conference on Management of data
  • Year:
  • 1992

Quantified Score

Hi-index 0.00

Visualization

Abstract

We consider the problem of exploiting parallelism to accelerate the performance of spacial access methods and specifically, R-trees [11]. Our goal is to design a server for spatial data, so that to maximize the throughput of range queries. This can be achieved by (a) maximizing parallelism for large range queries, and (b) by engaging as few disks as possible on point queries [22].We propose a simple hardware architecture consisting of one processor with several disks attached to it. On this architecture, we propose to distribute the nodes of a traditonal R-tree, with cross-disk pointers (“Multiplexed” R-tree). The R-tree code is identical to the one for a single-disk R-tree, with the only addition that we have to decide which disk a newly created R-tree node should be stored in. We propose and examine several criteria to choose a disk for a new node. The most successful one, termed “proximity index” or PI, estimates the similarity of the new node with the other R-tree nodes already on a disk, and chooses the disk with the lowest similarity. Experimental results show that our scheme consistently outperforms all the other heuristics for node-to-disk assignments, achieving up to 55% gains over the Round Robin one. Experiments also indicate that the multiplexed R-tree with PI heuristic gives better response time than the disk-stripping (=“Super-node”) approach, and imposes lighter load on the I/O sub-system.The speed up of our method is close to linear speed up, increasing with the size of the queries.