Using slice join for efficient evaluation of multi-way joins

Authors:
Ramon Lawrence
Affiliations:
University of British Columbia Okanagan, Computer Science, 3333 University Way, Kelowna, BC, Canada V1V 1V7
Venue:
Data & Knowledge Engineering
Year:
2008

Abstract

A standard hash join algorithm joins two relations at a time and must read the entire smaller input before it generates any results. There has been recent focus on constructing join algorithms that produce results faster and can join more than two relations simultaneously. Early joins, which can produce results before the smaller relation has been read in full, are useful for network joins where input arrival rates may vary, because the operator can adapt without explicit query re-optimization. Multi-way joins improve performance by reducing the number of intermediate results generated and are more resilient to poor estimates by the query optimizer. The only join algorithm that combines multi-way support with early result production is limited to processing joins where all inputs are joined on the same attribute. In this work, we propose a new hash-based join algorithm called slice join. Slice join is an early, multi-way join algorithm capable of joining relations on common attributes and relations connected by a sequence of functional dependencies. Slice join is useful for a larger number of query plans, performs fewer disk operations, and has a simpler duplicate detection technique than previous approaches. Experimental results demonstrate that slice join outperforms other multi-way join operators and binary join plans.
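To make the distinction between a standard hash join and an early join concrete, the following is a minimal in-memory sketch. It is not the slice join algorithm from the paper: it contrasts a classic build/probe hash join with a generic symmetric (early) two-way hash join, and it ignores disk partitioning, memory limits, and multi-way inputs. The function names `blocking_hash_join` and `early_hash_join`, and the convention of rows as tuples joined on a column index, are assumptions made for illustration only.

```python
from collections import defaultdict
from itertools import zip_longest


def blocking_hash_join(small, large, key_small, key_large):
    """Classic build/probe join: nothing is emitted until `small` is fully read."""
    table = defaultdict(list)
    for row in small:  # build phase consumes the whole smaller input
        table[row[key_small]].append(row)
    for row in large:  # probe phase streams the larger input
        for match in table.get(row[key_large], []):
            yield match, row


def early_hash_join(left, right, key_left, key_right):
    """Symmetric variant (illustrative): reads both inputs alternately and
    emits each matching pair as soon as both of its rows have arrived."""
    left_table, right_table = defaultdict(list), defaultdict(list)
    end = object()  # sentinel marking an exhausted input
    for l_row, r_row in zip_longest(left, right, fillvalue=end):
        if l_row is not end:
            left_table[l_row[key_left]].append(l_row)
            # Probe the rows of `right` seen so far.
            for match in right_table.get(l_row[key_left], []):
                yield l_row, match
        if r_row is not end:
            right_table[r_row[key_right]].append(r_row)
            # Probe the rows of `left` seen so far, including the one just added.
            for match in left_table.get(r_row[key_right], []):
                yield match, r_row
```

Both functions produce the same set of pairs; the difference lies in when the first pair becomes available. In the symmetric version, each pair is emitted exactly once, when the later of its two rows arrives, so this simplified in-memory setting needs no separate duplicate detection.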