Data Partitioning for Parallel Spatial Join Processing

  • Authors:
  • Xiaofang Zhou; David J. Abel; David Truffet

  • Affiliations:
  • CSIRO Mathematical and Information Sciences, GPO Box 664, Canberra, ACT 2601, Australia {xiaofang.zhou, dave.abel, david.truffet}@cmis.csiro.au

  • Venue:
  • Geoinformatica
  • Year:
  • 1998

Abstract

The cost of spatial join processing can be very high because of the large sizes of spatial objects and the computation-intensive spatial operations. While parallel processing seems a natural solution to this problem, it is not clear how spatial data can be partitioned for this purpose. Various spatial data partitioning methods are examined in this paper. A framework combining the data-partitioning techniques used by most parallel join algorithms in relational databases and the filter-and-refine strategy for spatial operation processing is proposed for parallel spatial join processing. Object duplication caused by multi-assignment in spatial data partitioning can result in extra CPU cost as well as extra communication cost. We find that the key to overcome this problem is to preserve spatial locality in task decomposition. In this paper we show that a near-optimal speedup can be achieved for parallel spatial join processing using our new algorithms.
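
The ideas named in the abstract (data partitioning, multi-assignment, and the filter step of filter-and-refine) can be illustrated with a minimal sketch. This is not the paper's algorithm: the uniform grid, the cell size, and the function names `cells_for`, `partition`, `mbr_intersects`, and `filter_step` are all assumptions made for illustration. Objects are reduced to minimum bounding rectangles (MBRs) given as `(xmin, ymin, xmax, ymax)`, and the refinement step (the exact geometry test on each candidate pair) is left out.

```python
from collections import defaultdict
from itertools import product


def cells_for(mbr, cell):
    """Return every grid cell that an MBR (xmin, ymin, xmax, ymax) overlaps."""
    xmin, ymin, xmax, ymax = mbr
    xs = range(int(xmin // cell), int(xmax // cell) + 1)
    ys = range(int(ymin // cell), int(ymax // cell) + 1)
    return list(product(xs, ys))


def partition(objects, cell):
    """Multi-assignment: copy each object id into every cell its MBR overlaps."""
    buckets = defaultdict(list)
    for oid, mbr in objects.items():
        for c in cells_for(mbr, cell):
            buckets[c].append(oid)
    return buckets


def mbr_intersects(a, b):
    """True if two MBRs overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def filter_step(r, s, cell):
    """Join the two inputs cell by cell; each cell is an independent task."""
    r_parts, s_parts = partition(r, cell), partition(s, cell)
    candidates = set()  # a set drops pairs reported by more than one cell
    for c in r_parts.keys() & s_parts.keys():
        for rid in r_parts[c]:
            for sid in s_parts[c]:
                if mbr_intersects(r[rid], s[sid]):
                    candidates.add((rid, sid))
    return candidates


# Hypothetical data: r1 and s1 both straddle the boundary between two cells.
r = {"r1": (1, 1, 12, 4), "r2": (15, 15, 18, 18)}
s = {"s1": (8, 2, 11, 6), "s2": (30, 30, 32, 32)}
candidates = filter_step(r, s, cell=10)
```

In this sketch `r1` and `s1` are each copied into two cells, so the same pair is tested twice before the set removes the repeat. That repeated work is a small instance of the extra CPU cost the abstract attributes to object duplication; in a distributed setting the copies would also have to be shipped to more than one processor.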