Optimal Secondary Storage Access Sequence for Performing Relational Join

  • Authors:
  • F. Fotouhi;S. Pramanik

  • Affiliations:
  • -;-

  • Venue:
  • IEEE Transactions on Knowledge and Data Engineering
  • Year:
  • 1989

Quantified Score

Hi-index 0.00

Visualization

Abstract

Two graph models are developed to determine the minimum required buffer size for achieving the theoretical lower bound on the number of disk accesses for performing relational joins. Here, the lower bound implies only one disk access per joining block or page. The first graph model is based on the block connectivity of the joining relations. Using this model, the problem of determining an ordered list of joining blocks that requires the smallest buffer is considered. It is shown that this problem as well as the problem of computing the least upper bound on the buffer size is NP-hard. The second graph model represents the page connectivity of the joining relations. It is shown that the problem of computing the least upper bound on the buffer size for the page connectivity model is also NP-hard. Heuristic procedures are presented for the page connectivity model and it is shown that the sequence obtained using the heuristics requires a near-optimal buffer size The authors also show the performance improvement of the proposed heuristics over the hybrid-has join algorithm for a wide range of join factors.