Predicting query execution time: Are optimizer cost models really unusable?

  • Authors:
  • Hakan Hacigumus;Yun Chi;Wentao Wu;Shenghuo Zhu;Junichi Tatemura;Jeffrey F. Naughton

  • Affiliations:
  • NEC Laboratories America, Cupertino, CA, USA;NEC Laboratories America, Cupertino, CA, USA;Computer Sciences Department, University of Wisconsin, Madison, WI, USA;NEC Laboratories America, Cupertino, CA, USA;NEC Laboratories America, Cupertino, CA, USA;Computer Sciences Department, University of Wisconsin, Madison, WI, USA

  • Venue:
  • ICDE '13 Proceedings of the 2013 IEEE International Conference on Data Engineering (ICDE 2013)
  • Year:
  • 2013

Abstract

Predicting query execution time is useful for many database management tasks, including admission control, query scheduling, progress monitoring, and system sizing. Recently, the research community has been exploring the use of statistical machine learning approaches to build predictive models for this task. An implicit assumption behind this work is that the cost models used by query optimizers are insufficient for query execution time prediction. In this paper we challenge this assumption and show that, while the simple approach of scaling the optimizer's estimated cost indeed fails, a properly calibrated optimizer cost model is surprisingly effective. However, even a well-tuned optimizer cost model will fail in the presence of errors in cardinality estimates. Accordingly, we investigate the novel idea of spending extra resources to refine the cardinality estimates for the query plan after it has been chosen by the optimizer but before execution. In our experiments we find that a well-calibrated query optimizer cost model, combined with cardinality estimation refinement, provides a low-overhead way to produce estimates that are always competitive with, and often much better than, the best reported numbers from the machine learning approaches.
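The distinction the abstract draws between a calibrated cost model and refined cardinality estimates can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a PostgreSQL-style cost model in which a plan's cost is a weighted sum of operation counts over five cost units. The function name `predict_time`, the per-unit timings in `CALIBRATED_SECONDS`, and the operation counts are all hypothetical values chosen for illustration.

```python
# Cost units named after PostgreSQL's planner parameters (assumed cost model).
COST_UNITS = ("seq_page", "random_page", "cpu_tuple", "cpu_index_tuple", "cpu_operator")

# Hypothetical calibrated values: measured seconds per operation on one machine.
# Real values would come from profiling the target system.
CALIBRATED_SECONDS = {
    "seq_page": 1e-4,
    "random_page": 1e-3,
    "cpu_tuple": 1e-6,
    "cpu_index_tuple": 5e-7,
    "cpu_operator": 2e-7,
}


def predict_time(op_counts, unit_seconds=CALIBRATED_SECONDS):
    """Predicted time = sum over cost units of (operation count * seconds per operation)."""
    return sum(op_counts.get(unit, 0) * unit_seconds[unit] for unit in COST_UNITS)


# Operation counts derived from the optimizer's original cardinality estimates
# (hypothetical plan: a sequential scan over 1000 pages and 100000 tuples).
estimated_counts = {"seq_page": 1000, "cpu_tuple": 100_000}

# The same plan after refinement reveals ten times as many tuples to process.
refined_counts = {"seq_page": 1000, "cpu_tuple": 1_000_000}

time_before_refinement = predict_time(estimated_counts)
time_after_refinement = predict_time(refined_counts)
```

With these made-up numbers the calibrated units are identical in both calls, yet the prediction changes substantially once the tuple count is corrected. This is the failure mode the abstract attributes to cardinality errors: calibration fixes the per-operation weights but cannot compensate for wrong operation counts.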