Correcting execution of distributed queries

Authors:
P. Bodorik;J. Pyra;J. S. Riordon
Affiliations:
School of Computer Science, Technical University of Nova Scotia, P.O. Box 1000, Halifax, Nova Scotia, B3J 2X4, Canada;School of Computer Science, Technical University of Nova Scotia, P.O. Box 1000, Halifax, Nova Scotia, B3J 2X4, Canada;Dept. of Systems and Computer Engineering, Carleton University, Ottawa, Ontario, K1S 5B6, Canada
Venue:
DPDS '90 Proceedings of the second international symposium on Databases in parallel and distributed systems
Year:
1990




Abstract

Algorithms for processing distributed queries require a priori estimates of the size of intermediate relations. Most such algorithms take a “static” approach in which the strategy is completely determined before processing begins. If size estimates are found to be inaccurate at some intermediate stage, there is no opportunity to re-schedule, and the result may be far from optimal. Adaptive query execution may be used to alleviate the problem. Care is necessary, though, to ensure that the delay associated with re-scheduling does not exceed the time saved through the use of a more efficient strategy. This paper presents a low-overhead method for deciding when to correct a strategy. Sampling is used to estimate the size of relations, and alternative heuristic strategies, prepared in a background mode, are used to decide when to correct. A correction is made only if it achieves a lower overall delay, including the correction time. Evaluation using a model of a distributed database indicates that the heuristic strategies are near optimal. It also suggests that it is usually correct to abort the creation of an intermediate relation that is much larger than predicted.
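
The decision rule described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's sampling scheme or cost model, so everything below is an assumption made for illustration: the function names `estimate_size` and `should_correct`, the use of simple random sampling, and the representation of delays as plain numbers are all hypothetical.

```python
import random


def estimate_size(relation, predicate, sample_size, rng):
    """Estimate how many tuples of `relation` satisfy `predicate`.

    Assumption: a simple random sample without replacement is drawn and
    the observed hit fraction is scaled up to the full relation size.
    """
    n = len(relation)
    if n == 0:
        return 0.0
    k = min(sample_size, n)
    sample = rng.sample(relation, k)
    hits = sum(1 for t in sample if predicate(t))
    return n * hits / k


def should_correct(delay_current, delay_alternative, correction_delay):
    """Return True only if switching strategies lowers the overall delay.

    All three arguments are hypothetical delay estimates in the same
    time unit: the remaining delay of the current strategy, the remaining
    delay of the alternative, and the cost of performing the correction.
    """
    return delay_alternative + correction_delay < delay_current


if __name__ == "__main__":
    rng = random.Random(0)
    relation = list(range(1000))
    estimate = estimate_size(relation, lambda t: t % 2 == 0, 50, rng)
    print("estimated qualifying tuples:", estimate)
    print("correct?", should_correct(10.0, 6.0, 3.0))
```

The comparison in `should_correct` charges the correction time to the alternative strategy, which mirrors the abstract's condition that a correction is made only if the overall delay, including correction time, is lower.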