Preventing bad plans by bounding the impact of cardinality estimation errors

Authors:
Guido Moerkotte; Thomas Neumann; Gabriele Steidl
Affiliations:
University of Mannheim, Mannheim, Germany; Max Planck Institute for Informatics, Saarbrücken, Germany; University of Mannheim, Mannheim, Germany
Venue:
Proceedings of the VLDB Endowment
Year:
2009

Abstract

Query optimizers rely on accurate estimates of the sizes of intermediate results. Wrong size estimates can lead to overly expensive execution plans. We first define the q-error to measure deviations of size estimates from actual sizes. The q-error enables the derivation of two important results: (1) We provide bounds such that, if the q-error stays below them, the query optimizer constructs an optimal plan. (2) If the q-error is bounded by a number q, we show that the cost of the produced plan is at most a factor of q^4 worse than the optimal plan. Motivated by these findings, we next show how to find the best approximation under the q-error. These techniques can then be used to build synopses for size estimates. Finally, we give some experimental results where we apply the developed techniques.
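
The abstract does not spell out the q-error formula, so the following is only a minimal sketch under a stated assumption: the q-error is taken to be the multiplicative factor max(estimate/actual, actual/estimate) for positive sizes. The function names (`q_error`, `worst_case_cost_factor`, `best_constant`) are hypothetical and chosen for illustration. The last function shows the simplest possible case of a "best approximation under the q-error", a single constant for a set of positive values. It is not the approximation technique developed in the paper.

```python
from __future__ import annotations

import math


def q_error(estimate: float, actual: float) -> float:
    """Multiplicative deviation of an estimate from the actual size.

    Assumed definition: max(estimate/actual, actual/estimate), which is
    always >= 1 and treats over- and underestimation symmetrically.
    """
    if estimate <= 0 or actual <= 0:
        raise ValueError("this sketch only handles positive sizes")
    return max(estimate / actual, actual / estimate)


def worst_case_cost_factor(q: float) -> float:
    """Bound quoted in the abstract: plan cost is at most q^4 times optimal."""
    if q < 1:
        raise ValueError("a q-error bound must be at least 1")
    return q ** 4


def best_constant(values: list[float]) -> tuple[float, float]:
    """Constant that minimizes the maximum q-error over positive values.

    Illustrative special case only: the geometric mean of the smallest and
    largest value balances the two extreme ratios, giving a maximum
    q-error of sqrt(max/min).
    """
    lo, hi = min(values), max(values)
    if lo <= 0:
        raise ValueError("this sketch only handles positive values")
    return math.sqrt(lo * hi), math.sqrt(hi / lo)


if __name__ == "__main__":
    # Estimating 50 or 200 rows when 100 are produced is off by the same factor.
    print(q_error(50, 100), q_error(200, 100))
    # With all estimates within a factor of 2, the bound on plan cost is 2^4.
    print(worst_case_cost_factor(2))
    # One constant standing in for sizes between 10 and 1000.
    print(best_constant([10, 100, 1000]))
```

A ratio-based measure fits this setting because it is scale-free: an estimate of 50 for 100 rows counts as badly as 50,000 for 100,000 rows, and a bounded ratio is what the q^4 statement is expressed in.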