Declustering of key-based partitioned signature files

  • Authors:
  • Paolo Ciaccia;Paolo Tiberio;Pavel Zezula

  • Affiliations:
  • Univ. of Bologna, Bologna, Italy;Univ. of Bologna, Bologna, Italy;Technical Univ. of Brno, Brno, Czech Republic

  • Venue:
  • ACM Transactions on Database Systems (TODS)
  • Year:
  • 1996

Quantified Score

Hi-index 0.00

Visualization

Abstract

Access methods based on signature files can largely benefit from possibilities offered by parallel environments. To this end, an effective declustering strategy that would distribute signatures over a set of parallel independent disks has to be combined with a synergic clustering which is employed to avoid searching the whole signature file while executing a query. This article proposes two parallel signature file organizations, Hamming Filter (HF) and Hamming + Filter (H+F), whose common declustering strategy is based on error correcting codes, and where clustering is achieved by organizing signatures into fixed-size buckets, each containing signatures sharing the same key value. HF allocates signatures on disks in a static way and works well if a correct relationship holds between the parameters of the code and the size of the file. H+F is a generalization of HF suitable to manage highly dynamic files. It uses a dynamic declustering, obtained through a sequence of codes, and organizes a smooth migration of signatures between disks so that high performance levels are retained regardless of current file size. Theoretical analysis characterizes the best-case, expected, and worst-case behaviors of these organizations. Analytical results are verified by experiments on prototype systems.