Improving hash join performance through prefetching

  • Authors: Shimin Chen; Anastassia Ailamaki; Phillip B. Gibbons; Todd C. Mowry
  • Affiliations (in author order): Intel Research Pittsburgh, Pittsburgh, PA; Carnegie Mellon University, Pittsburgh, PA; Intel Research Pittsburgh, Pittsburgh, PA; Carnegie Mellon University and Intel Research Pittsburgh, Pittsburgh, PA
  • Venue: ACM Transactions on Database Systems (TODS)
  • Year: 2007

Abstract

Hash join algorithms suffer from extensive CPU cache stalls. This article shows that the standard hash join algorithm for disk-oriented databases (i.e., GRACE) spends over 80% of its user time stalled on CPU cache misses, and explores the use of CPU cache prefetching to improve its cache performance. Applying prefetching to hash joins is complicated by the data dependencies, multiple code paths, and inherent randomness of hashing. We present two techniques, group prefetching and software-pipelined prefetching, that overcome these complications. These schemes achieve 1.29-4.04X speedups for the join phase and 1.37-3.49X speedups for the partition phase over GRACE and simple prefetching approaches. Moreover, compared with previous cache-aware approaches (i.e., cache partitioning), the schemes are at least 36% faster on large relations and do not require exclusive use of the CPU cache to be effective. Finally, when elapsed real times are compared with disk I/O taken into account, our cache prefetching schemes achieve 1.12-1.84X speedups for the join phase and 1.06-1.60X speedups for the partition phase over the GRACE hash join algorithm.
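
To make the idea of group prefetching concrete, the following is a minimal sketch of a grouped hash-table probe. It is an illustration written for this summary, not the authors' implementation: the chained-bucket layout, the names Node, Table, hash_key, table_insert and probe_grouped, and the group size of 8 are all assumptions, and __builtin_prefetch is the GCC/Clang intrinsic rather than anything named in the article. The probe is split into stages that each run over a whole group of keys, so the cache misses of one key can overlap with the work done for the others.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GROUP 8 /* hypothetical group size; in practice tuned to memory latency */

/* Hypothetical chained hash table; the layout is an assumption of this sketch. */
typedef struct Node {
    uint32_t key;
    uint32_t payload;
    struct Node *next;
} Node;

typedef struct {
    Node **buckets;
    size_t nbuckets; /* must be a power of two */
} Table;

static size_t hash_key(uint32_t k, size_t nbuckets) {
    /* Multiplicative hash, masked down to the bucket count. */
    return (size_t)(k * 2654435761u) & (nbuckets - 1);
}

static void table_insert(Table *t, Node *n) {
    size_t b = hash_key(n->key, t->nbuckets);
    n->next = t->buckets[b];
    t->buckets[b] = n;
}

/* Counts matching (probe key, build node) pairs, processing keys in groups. */
static size_t probe_grouped(const Table *t, const uint32_t *keys, size_t n) {
    assert((t->nbuckets & (t->nbuckets - 1)) == 0);
    size_t matches = 0;
    for (size_t base = 0; base < n; base += GROUP) {
        size_t g = (n - base < GROUP) ? n - base : GROUP;
        size_t slot[GROUP];
        const Node *cur[GROUP];

        /* Stage 1: hash every key in the group and prefetch its bucket header. */
        for (size_t i = 0; i < g; i++) {
            slot[i] = hash_key(keys[base + i], t->nbuckets);
            __builtin_prefetch(&t->buckets[slot[i]], 0, 3);
        }
        /* Stage 2: read the (hopefully cached) headers, prefetch the first nodes. */
        for (size_t i = 0; i < g; i++) {
            cur[i] = t->buckets[slot[i]];
            if (cur[i] != NULL)
                __builtin_prefetch(cur[i], 0, 3);
        }
        /* Stage 3: walk each chain and compare keys. Nodes beyond the first
           are not prefetched in this simplified sketch. */
        for (size_t i = 0; i < g; i++) {
            for (const Node *p = cur[i]; p != NULL; p = p->next) {
                if (p->key == keys[base + i])
                    matches++;
            }
        }
    }
    return matches;
}
```

The sketch covers only the probe side of the join phase and handles a single code path (a chain walk). Software-pipelined prefetching, the second technique named in the abstract, is not shown here.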