Join algorithm costs revisited

  • Authors:
  • Evan P. Harris;Kotagiri Ramamohanarao

  • Affiliations:
  • Department of Computer Science, The University of Melbourne, Parkville VIC 3052, Australia;Department of Computer Science, The University of Melbourne, Parkville VIC 3052, Australia

  • Venue:
  • The VLDB Journal — The International Journal on Very Large Data Bases
  • Year:
  • 1996

Quantified Score

Hi-index 0.00

Visualization

Abstract

A method of analysing join algorithms based upon the time required to access, transfer and perform the relevant CPU-based operations on a disk page is proposed. The costs of variations of several of the standard join algorithms, including nested block, sort-merge, GRACE hash and hybrid hash, are presented. For a given total buffer size, the cost of these join algorithms depends on the parts of the buffer allocated for each purpose. For example, when joining two relations using the nested block join algorithm, the amount of buffer space allocated for the outer and inner relations can significantly affect the cost of the join. Analysis of expected and experimental results of various join algorithms show that a combination of the optimal nested block and optimal GRACE hash join algorithms usually provide the greatest cost benefit, unless the relation size is a small multiple of the memory size. Algorithms to quickly determine a buffer allocation producing the minimal cost for each of these algorithms are presented. When the relation size is a small multiple of the amount of main memory available (typically up to three to six times), the hybrid hash join algorithm is preferable.