Exploiting join cardinality for faster hash joins

Authors:
Michael Henderson; Bryce Cutt; Ramon Lawrence
Affiliations:
University of British Columbia, Okanagan; University of British Columbia, Okanagan; University of British Columbia, Okanagan
Venue:
Proceedings of the 2009 ACM symposium on Applied Computing
Year:
2009

Abstract

Hash joins combine massive relations in data warehouses, decision support systems, and scientific data stores. Faster hash join performance significantly improves query throughput, response time, and overall system performance. In this work, we demonstrate how using join cardinality improves hash join performance. The key contribution is the development of an algorithm to determine join cardinality in an arbitrary query plan. We implemented early hash join and the join cardinality algorithm in PostgreSQL. Experimental results demonstrate that early hash join has an immediate response time that is an order of magnitude faster than the existing hybrid hash join implementation. One-to-one joins execute up to 50% faster and perform significantly fewer I/Os, and one-to-many joins have similar or better performance over all memory sizes.
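
The abstract does not spell out how cardinality knowledge is used, so the following is only a minimal sketch of one plausible benefit, not the authors' algorithm or the PostgreSQL implementation. It assumes an in-memory hash join in which the caller already knows the join is one-to-one; in that case a build tuple that has found its match can never match again and can be dropped from the hash table, freeing memory. The function name `hash_join`, the `one_to_one` flag, and the sample rows are all hypothetical.

```python
from collections import defaultdict

def hash_join(build, probe, build_key, probe_key, one_to_one=False):
    """Illustrative in-memory hash join (hypothetical helper, not from the paper).

    Returns (results, remaining), where results is a list of
    (build_row, probe_row) pairs and remaining is the number of distinct
    keys still held in the hash table after probing.
    """
    # Build phase: hash every build row on its join key.
    table = defaultdict(list)
    for row in build:
        table[row[build_key]].append(row)

    results = []
    # Probe phase: look up each probe row's key in the hash table.
    for p in probe:
        k = p[probe_key]
        matches = table.get(k)  # .get() does not create an empty entry
        if not matches:
            continue
        for b in matches:
            results.append((b, p))
        if one_to_one:
            # Assumption: each key matches at most once on both sides,
            # so the matched build rows can be discarded immediately.
            del table[k]
    return results, len(table)

# Hypothetical sample data: three build rows, two matching probe rows.
customers = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}, {"id": 3, "name": "c"}]
accounts = [{"cid": 1, "balance": 10}, {"cid": 3, "balance": 30}]

pairs_plain, left_plain = hash_join(customers, accounts, "id", "cid")
pairs_1to1, left_1to1 = hash_join(customers, accounts, "id", "cid", one_to_one=True)
```

Both calls produce the same join result; the difference is that the one-to-one variant ends with fewer entries held in the table, which is the kind of memory saving that could translate into fewer I/Os when the table does not fit in memory.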