Interchanging group-by and join in distributed query processing

Authors:
Weipeng Paul Yan
Affiliations:
University of Waterloo, Waterloo, Ontario
Venue:
CASCON '93 Proceedings of the 1993 conference of the Centre for Advanced Studies on Collaborative research: distributed computing - Volume 2
Year:
1993

Abstract

In previous work we have shown that the order of evaluating join and group-by can be interchanged in an SQL query under certain conditions. In many cases, performing group-by before join is a better way of evaluating the query. However, queries do exist for which it is better to perform join before group-by. When the conditions for interchanging the order of join and group-by for an SQL query are satisfied, the evaluation order should be determined mainly by the objective function of the query processor. This paper shows that the conditions can be used for estimating the cost of the two alternative evaluation plans in distributed query processing; specifically, estimating the cardinalities of the results of joins in the two alternative plans. It also proposes some strategies for deciding the direction of the transformation and a procedure for deciding the evaluation order of join and group-by for a distributed query.
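To make the two evaluation orders concrete, here is a minimal Python sketch on toy data. Everything in it is an illustrative assumption and not taken from the paper: the relations `employees` and `departments`, the function names, and the use of "employee-side rows entering the join" as a stand-in for the shipping cost in a distributed setting. The abstract does not state the interchange conditions, so the sketch simply assumes a case where the rewrite is safe: the join column `dept_id` is a key of `departments` and the aggregate is a `SUM` over a column of `employees`.

```python
from collections import defaultdict

# Hypothetical toy relations (not from the paper).
employees = [  # (emp_id, dept_id, salary)
    (1, 10, 100), (2, 10, 150), (3, 20, 200), (4, 20, 50), (5, 20, 75),
]
departments = [(10, "Sales"), (20, "Research")]  # (dept_id, name); dept_id is a key


def join_then_group(emps, depts):
    """Plan A: join first, then group by (dept_id, name) and sum salary."""
    dept_name = dict(depts)
    joined = [(d, dept_name[d], s) for (_, d, s) in emps if d in dept_name]
    totals = defaultdict(int)
    for d, name, s in joined:
        totals[(d, name)] += s
    # Every employee row enters the join in this plan.
    return dict(totals), len(emps)


def group_then_join(emps, depts):
    """Plan B: pre-aggregate employees by dept_id, then join the groups."""
    partial = defaultdict(int)
    for _, d, s in emps:
        partial[d] += s
    dept_name = dict(depts)
    result = {(d, dept_name[d]): t for d, t in partial.items() if d in dept_name}
    # Only one row per distinct dept_id enters the join in this plan.
    return result, len(partial)


result_a, rows_a = join_then_group(employees, departments)
result_b, rows_b = group_then_join(employees, departments)
print(result_a == result_b, rows_a, rows_b)
```

On this data both plans return the same totals, but Plan B feeds two rows into the join where Plan A feeds five. With nearly one employee per department the pre-aggregation would save little, which is the kind of case where joining first can be the better choice.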