Exploiting join cardinality for faster hash joins

  • Authors:
  • Michael Henderson;Bryce Cutt;Ramon Lawrence

  • Affiliations:
  • University of British Columbia, Okanagan;University of British Columbia, Okanagan;University of British Columbia, Okanagan

  • Venue:
  • Proceedings of the 2009 ACM symposium on Applied Computing
  • Year:
  • 2009

Quantified Score

Hi-index 0.00

Visualization

Abstract

Hash joins combine massive relations in data warehouses, decision support systems, and scientific data stores. Faster hash join performance significantly improves query throughput, response time, and overall system performance. In this work, we demonstrate how using join cardinality improves hash join performance. The key contribution is the development of an algorithm to determine join cardinality in an arbitrary query plan. We implemented early hash join and the join cardinality algorithm in PostgreSQL. Experimental results demonstrate that early hash join has an immediate response time that is an order of magnitude faster than the existing hybrid hash join implementation. One-to-one joins execute up to 50% faster and perform significantly fewer I/Os, and one-to-many joins have similar or better performance over all memory sizes.