Improving Hash Join Performance through Prefetching

  • Authors:
  • Shimin Chen; Anastassia Ailamaki; Phillip B. Gibbons; Todd C. Mowry

  • Venue:
  • ICDE '04 Proceedings of the 20th International Conference on Data Engineering
  • Year:
  • 2004

Abstract

Hash join algorithms suffer from extensive CPU cache stalls. This paper shows that the standard hash join algorithm for disk-oriented databases (i.e. GRACE) spends over 73% of its user time stalled on CPU cache misses, and explores the use of prefetching to improve its cache performance. Applying prefetching to hash joins is complicated by the data dependencies, multiple code paths, and inherent randomness of hashing. We present two techniques, group prefetching and software-pipelined prefetching, that overcome these complications. These schemes achieve 2.0-2.9X speedups for the join phase and 1.4-2.6X speedups for the partition phase over GRACE and simple prefetching approaches. Compared with previous cache-aware approaches (i.e. cache partitioning), the schemes are at least 50% faster on large relations and do not require exclusive use of the CPU cache to be effective.
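
To make the group prefetching idea concrete, the following is a minimal sketch in C of a probe loop that handles keys in small groups. It is an illustration under stated assumptions, not the authors' implementation. The chained hash table layout, the names HashTable, Node, GROUP, ht_init, ht_insert and probe_group_prefetch, the hash function, and the group size of 16 are all hypothetical. __builtin_prefetch is the GCC/Clang builtin and is only a hint to the hardware. Each stage issues prefetches for the whole group before any of the prefetched data is read, so that cache misses for different keys can overlap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define GROUP 16 /* hypothetical group size; would need tuning per machine */

typedef struct Node {
    uint32_t key;
    uint32_t payload;
    struct Node *next;
} Node;

typedef struct {
    Node **buckets;  /* bucket headers: head pointer of each chain */
    Node *nodes;     /* preallocated node pool */
    size_t nbuckets; /* assumed to be a power of two */
    size_t capacity;
    size_t used;
} HashTable;

/* Illustrative multiplicative hash, masked to the bucket count. */
static inline size_t hash_key(uint32_t k, size_t nbuckets) {
    return (size_t)(k * 2654435761u) & (nbuckets - 1);
}

int ht_init(HashTable *ht, size_t nbuckets, size_t capacity) {
    assert(nbuckets != 0 && (nbuckets & (nbuckets - 1)) == 0);
    ht->buckets = calloc(nbuckets, sizeof *ht->buckets);
    ht->nodes = malloc(capacity * sizeof *ht->nodes);
    ht->nbuckets = nbuckets;
    ht->capacity = capacity;
    ht->used = 0;
    return (ht->buckets && ht->nodes) ? 0 : -1;
}

/* Build side: push a tuple onto the front of its bucket chain. */
int ht_insert(HashTable *ht, uint32_t key, uint32_t payload) {
    if (ht->used == ht->capacity) return -1;
    Node *n = &ht->nodes[ht->used++];
    size_t s = hash_key(key, ht->nbuckets);
    n->key = key;
    n->payload = payload;
    n->next = ht->buckets[s];
    ht->buckets[s] = n;
    return 0;
}

/* Probe side: returns the number of matches and adds matched payloads
   to *payload_sum. */
size_t probe_group_prefetch(const HashTable *ht, const uint32_t *keys,
                            size_t n, uint64_t *payload_sum) {
    size_t matches = 0;
    for (size_t base = 0; base < n; base += GROUP) {
        size_t g = (n - base < GROUP) ? n - base : GROUP;
        size_t slot[GROUP];
        const Node *cur[GROUP];

        /* Stage 1: hash every key in the group, prefetch its bucket header. */
        for (size_t i = 0; i < g; i++) {
            slot[i] = hash_key(keys[base + i], ht->nbuckets);
            __builtin_prefetch(&ht->buckets[slot[i]], 0, 3);
        }
        /* Stage 2: read the headers, prefetch the first node of each chain. */
        for (size_t i = 0; i < g; i++) {
            cur[i] = ht->buckets[slot[i]];
            if (cur[i]) __builtin_prefetch(cur[i], 0, 3);
        }
        /* Stage 3: compare keys. Later chain nodes are not prefetched
           in this simplified sketch. */
        for (size_t i = 0; i < g; i++) {
            for (const Node *p = cur[i]; p; p = p->next) {
                if (p->key == keys[base + i]) {
                    matches++;
                    *payload_sum += p->payload;
                }
            }
        }
    }
    return matches;
}
```

A caller would build the table from the smaller relation with ht_insert and then pass an array of probe keys to probe_group_prefetch. Splitting the loop into stages is what sidesteps the data dependency mentioned in the abstract: the address of a chain node is only known after its bucket header has been loaded, so the two loads for one key cannot overlap with each other, but loads for different keys in the same group can.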