(Almost) Optimal parallel block access for range queries

  • Authors:
  • Mikhail J. Atallah;Sunil Prabhakar

  • Affiliations:
  • CERIAS and Department of Computer Science, Purdue University, West Lafayette, IN;Department of Computer Sciences, 1398 Computer Sciences Building, Purdue University, West Lafayette, IN

  • Venue:
  • Information Sciences—Informatics and Computer Science: An International Journal
  • Year:
  • 2003

Quantified Score

Hi-index 0.00

Visualization

Abstract

Range queries are an important class of queries for several applications can be achieved by tiling the multi-dimensional data and distributing it among multiple disks or nodes. It has been established that schemes that achieve optimal parallel block access exist only for a few special cases. Though several schemes for the allocation of tiles to disks have been developed, no scheme with non-trivial worst-case bound is known. We establish that any range query on a 2q × 2q-block grid of blocks can be performed using k = 2t disks (t ≤ q), in at most ⌈m/k⌉ + O(logk) parallel block accesses. We achieve this result by judiciously distributing the blocks among the k nodes or disks. Experimental data show that the algorithm achieves very close to ⌈m/k⌉ performance (on average less than 0.5 away from ⌈m/k⌉, with a worst-case of 3). Although several declustering schemes for range queries have been developed, prior to our work no additive non-trivial performance bounds were known. Our scheme guarantees performance within a (small) additive deviation from ⌈m/k⌉. Subsequent to this work, Bhatia et al. [Hierarchical declustering schemes for range queries, in: Proceedings of International Conference on Extending Database Technology, Konstanz, Germany, March 2000, p. 525] have proved that such a performance bound is essentially optimal for this kind of scheme, and have also extended our results to the case where the number of disks is a product of the form k1 * k2 * ... * kt where the kis need not all be 2.