A Similarity Graph-Based Approach to Declustering Problems and Its Application towards Paralleling Grid Files

  • Authors:
  • Duen-Ren Liu;Shashi Shekhar

  • Affiliations:
  • -;-

  • Venue:
  • ICDE '95 Proceedings of the Eleventh International Conference on Data Engineering
  • Year:
  • 1995

Quantified Score

Hi-index 0.00

Visualization

Abstract

We propose a new similarity-based technique for declustering data. The proposed method can adapt to available information about query distributions, data distributions, data sizes and partition-size constraints. The method is based on max-cut partitioning of a similarity graph defined over the given set of data, under constraints on the partition sizes. It maximizes the chances that a pair of data-items that are to be accessed together by queries are allocated to distinct disks. We show that the proposed method can achieve optimal speed-up for a query-set, if there exists any other declustering method which will achieve the optimal speed-up. Experiments in parallelizing grid files show that the proposed method outperforms mapping-function-based methods for interesting query distributions as well for non-uniform data distributions.