Declustering using error correcting codes

  • Authors:
  • C. Faloutsos; D. Metaxas

  • Affiliations:
  • University of Maryland, College Park and University of Maryland Institute for Advanced Computer Studies (UMIACS); University of Maryland, College Park and University of Toronto, Ontario, Canada

  • Venue:
  • PODS '89 Proceedings of the eighth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems
  • Year:
  • 1989

Abstract

The problem examined is how to distribute a binary Cartesian product file over multiple disks so as to maximize parallelism for partial match queries. Cartesian product files arise from some secondary key access methods, such as multiattribute hashing [10] and the grid file [6]. In the binary case, the problem reduces to partitioning the 2^n binary strings of n bits into m groups of dissimilar strings. The main idea proposed in this paper is to group the strings so that each group forms an Error Correcting Code (ECC). This construction guarantees that the strings in any given group have large Hamming distances, i.e., they differ in many bit positions. Intuitively, this should result in good declustering. We briefly review previous declustering heuristics, describe exactly how to build a declustering scheme using an ECC, and prove a theorem that gives a necessary condition for our method to be optimal. Analytical results show that our method is superior to older heuristics and that it comes very close to the theoretical (non-tight) bound.
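
The grouping idea can be illustrated with a small sketch. It is not taken from the paper: it assumes a linear code, here the (7,4) Hamming code, and assigns each 7-bit bucket address to a disk by its syndrome, so that disk 0 holds the code itself and every other disk holds one of its cosets. The names `H`, `disk_of`, `hamming_distance`, `groups` and `min_dist` are illustrative only, and the choice n = 7, m = 8 is an assumption made to keep the example small.

```python
from itertools import combinations, product

# Parity-check matrix of the (7,4) Hamming code (an assumed example code):
# column j holds the binary representation of j, for j = 1..7.
H = [[(j >> i) & 1 for j in range(1, 8)] for i in range(3)]


def disk_of(bits):
    """Map a 7-bit bucket address to a disk number via its syndrome H*x mod 2."""
    syndrome = [sum(h * b for h, b in zip(row, bits)) % 2 for row in H]
    return sum(s << i for i, s in enumerate(syndrome))


def hamming_distance(a, b):
    """Number of bit positions in which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


# Partition all 2^7 = 128 binary strings into m = 8 groups (one per disk).
groups = {}
for bits in product((0, 1), repeat=7):
    groups.setdefault(disk_of(bits), []).append(bits)

# Smallest Hamming distance between two strings stored on the same disk.
min_dist = min(
    hamming_distance(a, b)
    for members in groups.values()
    for a, b in combinations(members, 2)
)

print(len(groups), {len(g) for g in groups.values()}, min_dist)
```

Under these assumptions, each of the 8 disks receives 16 strings, and any two strings on the same disk differ in at least 3 bit positions, since the difference of two strings with the same syndrome is a nonzero codeword. This is the "large Hamming distance within a group" property the abstract refers to; whether a particular partial match query is then spread evenly across the disks depends on which bits it leaves unspecified.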