Scheduling multiple queries on a parallel machine

Authors:
Joel L. Wolf;John Turek;Ming-Syan Chen;Philip S. Yu
Affiliations:
IBM Watson Research Center, P.O. Box 704, Yorktown Heights, NY;IBM Watson Research Center, P.O. Box 704, Yorktown Heights, NY;IBM Watson Research Center, P.O. Box 704, Yorktown Heights, NY;IBM Watson Research Center, P.O. Box 704, Yorktown Heights, NY
Venue:
SIGMETRICS '94 Proceedings of the 1994 ACM SIGMETRICS conference on Measurement and modeling of computer systems
Year:
1994

Algorithms

Abstract

Considerable progress has recently been made toward the efficient parallelization of individual phases of single queries in multiprocessor database systems. In this paper we devise and evaluate a number of scheduling algorithms designed to handle multiple parallel queries. One of these algorithms emerges as a clear winner. This algorithm is hierarchical in nature. In the first phase, a good-quality precedence-based schedule is created for each individual query and each possible number of processors; this component employs dynamic programming. In the second phase, the results of the first phase are used to create an overall schedule for the full set of queries; this component is based on previously published work on nonprecedence-based malleable scheduling. Even though the problem we are considering is NP-hard in the strong sense, the multiple-query schedules generated by our hierarchical algorithm are seen experimentally to be close to optimal.
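
The two-phase structure described in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: the abstract does not give its details, so both phases below are simplified stand-ins. Phase 1 assumes each query speeds up linearly up to a hypothetical parallelism cap (`max_par`) instead of running dynamic programming over a precedence graph, and phase 2 uses a greedy hill climb over a shelf packing instead of the published malleable scheduling method. All function and parameter names are made up for illustration.

```python
from typing import Dict, List, Tuple


def phase1_times(work: float, max_par: int, total_procs: int) -> List[float]:
    """Stand-in for phase 1: completion time of one query on p = 1..total_procs.

    ASSUMPTION: linear speedup up to max_par processors. The paper instead
    builds a precedence-based schedule per query with dynamic programming.
    """
    return [work / min(p, max_par) for p in range(1, total_procs + 1)]


def shelf_makespan(allot: Dict[str, int],
                   times: Dict[str, List[float]],
                   total_procs: int) -> float:
    """Pack queries into shelves (first fit, longest first).

    Queries on the same shelf run concurrently; shelves run one after another.
    Because queries are placed longest first, the first query on a shelf
    determines that shelf's duration.
    """
    order = sorted(allot, key=lambda q: times[q][allot[q] - 1], reverse=True)
    shelves: List[List[float]] = []  # each entry: [free processors, duration]
    for q in order:
        need = allot[q]
        for shelf in shelves:
            if shelf[0] >= need:
                shelf[0] -= need
                break
        else:
            shelves.append([total_procs - need, times[q][need - 1]])
    return sum(duration for _, duration in shelves)


def phase2_schedule(times: Dict[str, List[float]],
                    total_procs: int) -> Tuple[Dict[str, int], float]:
    """Stand-in for phase 2: choose a processor count per query.

    ASSUMPTION: greedy hill climbing. Start with one processor per query and
    grant one more processor to a query whenever that lowers the makespan.
    """
    allot = {q: 1 for q in times}
    best = shelf_makespan(allot, times, total_procs)
    improved = True
    while improved:
        improved = False
        for q in times:
            if allot[q] < total_procs:
                trial = dict(allot)
                trial[q] += 1
                makespan = shelf_makespan(trial, times, total_procs)
                if makespan < best - 1e-12:
                    allot, best, improved = trial, makespan, True
    return allot, best


# Hypothetical workload: (total work, parallelism cap) per query, 4 processors.
queries = {"A": (8.0, 4), "B": (4.0, 2), "C": (2.0, 1)}
times = {q: phase1_times(w, cap, 4) for q, (w, cap) in queries.items()}
allotment, makespan = phase2_schedule(times, 4)
```

The point of the sketch is the interface between the phases: phase 1 reduces each query to a table of completion times indexed by processor count, so phase 2 can treat the queries as independent malleable tasks and ignore their internal precedence constraints. A greedy search like this one can stop at a local optimum; it offers none of the guarantees of the published method.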