On applying hash filters to improving the execution of multi-join queries

Authors:
Ming-Syan Chen;Hui-I Hsiao;Philip S. Yu
Affiliations:
Electrical Engineering Department, National Taiwan University, Taipei, Taiwan;IBM T.J. Watson Research Center, P.O.Box 704, Yorktown, NY 10598, USA;IBM T.J. Watson Research Center, P.O.Box 704, Yorktown, NY 10598, USA
Venue:
The VLDB Journal — The International Journal on Very Large Data Bases
Year:
1997

Abstract

In this paper, we explore an approach of interleaving a bushy execution tree with hash filters to improve the execution of multi-join queries. Similar to semi-joins in distributed query processing, hash filters can be applied to eliminate non-matching tuples from joining relations before the execution of a join, thus reducing the join cost. Note that hash filters built in different execution stages of a bushy tree can have different costs and effects. The effect of hash filters is evaluated first. Then, an efficient scheme to determine an effective sequence of hash filters for a bushy execution tree is developed, where hash filters are built and applied based on the join sequence specified in the bushy tree so that not only is the reduction effect optimized but also the cost associated is minimized. Various schemes using hash filters are implemented and evaluated via simulation. It is experimentally shown that the application of hash filters is in general a very powerful means to improve the execution of multi-join queries, and the improvement becomes more prominent as the number of relations in a query increases.
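
As a rough illustration of the basic mechanism the abstract describes, the following Python sketch builds a hash filter (a bit vector) on the join attribute of one relation and uses it to discard non-matching tuples of the other relation before a hash join. It is a minimal sketch under stated assumptions, not the scheme developed in the paper: the function names, the tuple-based relation layout, the filter size, and the sample data are all hypothetical, and it does not address how to order filters within a bushy execution tree.

```python
from typing import Dict, Hashable, List, Sequence, Tuple

Row = Tuple[Hashable, ...]  # assumption: a relation is a list of tuples


def build_hash_filter(rows: Sequence[Row], key_index: int, num_bits: int) -> bytearray:
    """Mark one slot per hashed join-attribute value (one byte per slot for simplicity)."""
    bits = bytearray(num_bits)
    for row in rows:
        bits[hash(row[key_index]) % num_bits] = 1
    return bits


def apply_hash_filter(rows: Sequence[Row], key_index: int, bits: bytearray) -> List[Row]:
    """Keep only rows whose join attribute hashes to a marked slot.

    Hash collisions can let some non-matching rows through, but a row that
    has a join partner is never dropped.
    """
    n = len(bits)
    return [row for row in rows if bits[hash(row[key_index]) % n]]


def hash_join(left: Sequence[Row], left_key: int,
              right: Sequence[Row], right_key: int) -> List[Row]:
    """Plain in-memory hash join: build a table on `left`, probe with `right`."""
    table: Dict[Hashable, List[Row]] = {}
    for row in left:
        table.setdefault(row[left_key], []).append(row)
    return [l + r for r in right for l in table.get(r[right_key], [])]


# Hypothetical sample relations joined on their first column.
orders = [(1, "a"), (2, "b"), (3, "c")]
items = [(1, "x"), (4, "y"), (1, "z"), (9, "w")]

# Build the filter on `orders`, then shrink `items` before joining.
order_filter = build_hash_filter(orders, key_index=0, num_bits=64)
reduced_items = apply_hash_filter(items, key_index=0, bits=order_filter)

joined_with_filter = hash_join(orders, 0, reduced_items, 0)
joined_without_filter = hash_join(orders, 0, items, 0)
```

In this sketch the filtered and unfiltered joins produce the same result; the filter only reduces the number of tuples that reach the join, which is the cost saving the abstract refers to. A larger `num_bits` lowers the chance of false positives at the price of a larger filter.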