Least expected cost query optimization: what can we expect?

Authors:
Francis Chu; Joseph Halpern; Johannes Gehrke
Affiliations:
Cornell University, Ithaca, NY; Cornell University, Ithaca, NY; Cornell University, Ithaca, NY
Venue:
Proceedings of the twenty-first ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Year:
2002

Citing 11

Online aggregation. SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
Memory management during run generation in external sorting. SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Least expected cost query optimization: an exercise in utility. PODS '99 Proceedings of the eighteenth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Principles of Database and Knowledge-Base Systems: Volume II: The New Technologies
Access path selection in a relational database management system. SIGMOD '79 Proceedings of the 1979 ACM SIGMOD international conference on Management of data
Optimizing Queries Across Diverse Data Sources. VLDB '97 Proceedings of the 23rd International Conference on Very Large Data Bases
Hash Joins and Hash Teams in Microsoft SQL Server. VLDB '98 Proceedings of the 24th International Conference on Very Large Data Bases
Cost Models DO Matter: Providing Cost Information for Diverse Data Sources in a Federated System. VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
LEO - DB2's LEarning Optimizer. Proceedings of the 27th International Conference on Very Large Data Bases
Parametric Query Optimization. VLDB '92 Proceedings of the 18th International Conference on Very Large Data Bases
Automating Statistics Management for Query Optimizers. ICDE '00 Proceedings of the 16th International Conference on Data Engineering

Cited 15

A characterization of the sensitivity of query optimization to storage access cost parameters. Proceedings of the 2003 ACM SIGMOD international conference on Management of data
Towards a robust query optimizer: a principled and practical approach. Proceedings of the 2005 ACM SIGMOD international conference on Management of data
Asking the right questions: model-driven optimization using probes. Proceedings of the twenty-fifth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Model-driven optimization using adaptive probes. SODA '07 Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete algorithms
On the production of anorexic plan diagrams. VLDB '07 Proceedings of the 33rd international conference on Very large data bases
Adaptive query processing. Foundations and Trends in Databases
Identifying robust plans through plan diagram reduction. Proceedings of the VLDB Endowment
Efficiently approximating query optimizer plan diagrams. Proceedings of the VLDB Endowment
Rethinking cost and performance of database systems. ACM SIGMOD Record
ROX: run-time optimization of XQueries. Proceedings of the 2009 ACM SIGMOD International Conference on Management of data
Variance aware optimization of parameterized queries. Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
How to probe for an extreme value. ACM Transactions on Algorithms (TALG)
Quantifying uncertainty in multi-dimensional cardinality estimations. CIKM '10 Proceedings of the 19th ACM international conference on Information and knowledge management
The Picasso database query optimizer visualizer. Proceedings of the VLDB Endowment
Adaptive Uncertainty Resolution in Bayesian Combinatorial Optimization Problems. ACM Transactions on Algorithms (TALG)

Abstract

A standard assumption in the database query optimization literature is that it suffices to optimize for the "typical" case, that is, the case in which various parameters (e.g., the amount of available memory or the selectivities of predicates) take on their "typical" values. It was claimed in [CHS99] that we could do better by choosing plans based on their expected cost. Here we investigate this issue more thoroughly. We show that in many circumstances of interest, a "typical" value of the parameter does give acceptable answers, provided that it is chosen carefully and we are interested only in minimizing expected running time. However, by minimizing the expected running time, we are effectively assuming that if plan p1 runs three times as long as plan p2, then p1 is exactly three times as bad as p2. An assumption like this is not always appropriate. We show that focusing on least expected cost can lead to significant improvement for a number of cost functions of interest.
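
To make the distinction in the abstract concrete, here is a minimal sketch in Python. It compares picking the best plan at a single "typical" parameter value with picking the plan of least expected cost, first when cost is just running time and then under a non-linear cost of running time. Everything in it is an assumption made for illustration: the two plan names, their cost functions, the memory distribution, and the `deadline_penalty` function are hypothetical and do not come from the paper.

```python
# Hypothetical illustration: the plan names, cost functions, and memory
# distribution below are invented for this sketch, not taken from the paper.

def cost_sort_merge(memory_pages):
    # Assumed: fast when memory is plentiful, slow when it is scarce.
    return 100 if memory_pages >= 1000 else 900

def cost_nested_loop(memory_pages):
    # Assumed: running time does not depend on available memory.
    return 300

PLANS = {"sort_merge": cost_sort_merge, "nested_loop": cost_nested_loop}

# Assumed distribution over available memory: (pages, probability).
MEMORY_DIST = [(500, 0.2), (2000, 0.8)]

def identity(running_time):
    # Linear cost: a run three times as long is exactly three times as bad.
    return running_time

def deadline_penalty(running_time):
    # Assumed non-linear cost: runs longer than 500 units count triple.
    return running_time if running_time <= 500 else 3 * running_time

def expected_cost(cost_fn, dist, cost_of_time=identity):
    # Probability-weighted cost of one plan over the parameter distribution.
    return sum(p * cost_of_time(cost_fn(m)) for m, p in dist)

def best_plan_at(memory_pages):
    # "Typical case" rule: optimize for one fixed parameter value.
    return min(PLANS, key=lambda name: PLANS[name](memory_pages))

def least_expected_cost_plan(dist, cost_of_time=identity):
    # Least-expected-cost rule: optimize the expectation over the distribution.
    return min(PLANS, key=lambda name: expected_cost(PLANS[name], dist, cost_of_time))

# Use the most probable memory value as the "typical" value.
typical_memory = max(MEMORY_DIST, key=lambda entry: entry[1])[0]
typical_choice = best_plan_at(typical_memory)
lec_time_choice = least_expected_cost_plan(MEMORY_DIST)
lec_penalized_choice = least_expected_cost_plan(MEMORY_DIST, deadline_penalty)
```

With these made-up numbers, the typical-value rule and least expected running time agree on `sort_merge` (expected running time 260 versus 300). Under the assumed `deadline_penalty`, the expected cost of `sort_merge` rises to 620, so the least-expected-cost rule switches to `nested_loop`. The numbers were chosen to show the two rules agreeing for running time and diverging once cost is non-linear in running time; they are not evidence of how often this happens in practice.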