Optimization of Parallel Execution for Multi-Join Queries

Authors:
Ming-Syan Chen; Philip S. Yu; Kun-Lung Wu
Venue:
IEEE Transactions on Knowledge and Data Engineering
Year:
1996

Abstract

In this paper, we study the subject of exploiting interoperator parallelism to optimize the execution of multi-join queries. Specifically, we focus on two major issues: 1) scheduling the execution sequence of multiple joins within a query, and 2) determining the number of processors to be allocated for the execution of each join operation obtained in 1). For the first issue, we propose and evaluate by simulation several methods to determine the general join sequences, or bushy trees. Despite their simplicity, the heuristics proposed can lead to general join sequences that significantly outperform the optimal sequential join sequence. The quality of the join sequences obtained by the proposed heuristics is shown to be fairly close to that of the optimal one. For the second issue, it is shown that the processor allocation for exploiting interoperator parallelism is subject to more constraints (such as execution dependency and system fragmentation) than those in the study of intraoperator parallelism for a single join. The concept of synchronous execution time is proposed to alleviate these constraints. Several heuristics to deal with the processor allocation, categorized into bottom-up and top-down approaches, are derived and evaluated by simulation. The relationship between issues 1) and 2) is explored. Among all the schemes evaluated, the two-step approach proposed emerges as the best solution to minimize the query execution time: it first applies the join sequence heuristic to build a bushy tree as if under a single-processor system, and then, in light of the concept of synchronous execution time, allocates processors to execute each join in the bushy tree in a top-down manner.
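
The two-step approach described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not state the join sequence heuristic, the cost model, or the allocation rule, so everything concrete here is an assumption made for illustration. In particular, `build_bushy_tree` assumes a greedy "smallest estimated result first" rule, `Node.work` assumes a join costs the sum of its input and output cardinalities, and `allocate_top_down` assumes that splitting a processor partition in proportion to the cumulative work of the two child subtrees is a reasonable stand-in for making both subtrees finish at about the same time. All names are hypothetical.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Optional


@dataclass
class Node:
    """A base relation (leaf) or a join (internal node) in a bushy tree."""
    rels: frozenset                  # base relations covered by this subtree
    size: float                      # estimated cardinality of the result
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    procs: int = 0                   # processors assigned to this join

    def work(self) -> float:
        """Cumulative work of the subtree (assumed cost: inputs + output per join)."""
        if self.left is None:
            return 0.0
        own = self.left.size + self.right.size + self.size
        return own + self.left.work() + self.right.work()


def build_bushy_tree(sizes, selectivity):
    """Step 1 (assumed heuristic): repeatedly join the pair of joinable
    subtrees whose estimated result is smallest, ignoring processors."""
    forest = [Node(frozenset([r]), float(s)) for r, s in sizes.items()]
    while len(forest) > 1:
        best = None
        for a, b in combinations(forest, 2):
            sel, joinable = 1.0, False
            for x in a.rels:
                for y in b.rels:
                    key = frozenset((x, y))
                    if key in selectivity:
                        sel *= selectivity[key]
                        joinable = True
            if not joinable:
                continue             # skip Cartesian products
            out = a.size * b.size * sel
            if best is None or out < best.size:
                best = Node(a.rels | b.rels, out, a, b)
        if best is None:
            raise ValueError("query graph is not connected")
        forest = [n for n in forest if n is not best.left and n is not best.right]
        forest.append(best)
    return forest[0]


def allocate_top_down(node, procs):
    """Step 2 (assumed rule): a join gets its whole partition; the partition
    is then split between the child subtrees in proportion to their work."""
    if node.left is None:
        return                       # base relations need no join processors
    node.procs = procs
    lw, rw = node.left.work(), node.right.work()
    if procs < 2 or lw == 0 or rw == 0:
        # Nothing to split: child joins (if any) reuse the same partition.
        allocate_top_down(node.left, procs)
        allocate_top_down(node.right, procs)
        return
    left_p = min(max(round(procs * lw / (lw + rw)), 1), procs - 1)
    allocate_top_down(node.left, left_p)
    allocate_top_down(node.right, procs - left_p)


# Hypothetical chain query A - B - C - D with made-up statistics.
sizes = {"A": 1000, "B": 100, "C": 1000, "D": 100}
selectivity = {
    frozenset("AB"): 0.01,
    frozenset("BC"): 0.1,
    frozenset("CD"): 0.01,
}
root = build_bushy_tree(sizes, selectivity)
allocate_top_down(root, 8)
```

With these made-up statistics the sketch joins A with B and C with D first, then joins the two intermediate results, so the tree is bushy rather than left-deep. The two lower joins carry equal estimated work, so the eight processors are split evenly between them and the final join uses all eight.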