Learning-based Query Performance Modeling and Prediction

Authors:
Mert Akdere;Ugur Çetintemel;Matteo Riondato;Eli Upfal;Stanley B. Zdonik
Venue:
ICDE '12 Proceedings of the 2012 IEEE 28th International Conference on Data Engineering
Year:
2012

Abstract

Accurate query performance prediction (QPP) is central to effective resource management, query optimization and query scheduling. Analytical cost models, used in the current generation of query optimizers, have been successful in comparing the costs of alternative query plans, but they are poor predictors of execution latency. As a more promising approach to QPP, this paper studies the practicality and utility of sophisticated learning-based models, which have recently been applied to a variety of predictive tasks with great success, in both static (i.e., fixed) and dynamic query workloads. We propose and evaluate predictive modeling techniques that learn query execution behavior at different granularities, ranging from coarse-grained plan-level models to fine-grained operator-level models. We demonstrate that these two extremes offer a tradeoff between high accuracy for static workload queries and generality to unforeseen queries in dynamic workloads, respectively, and introduce a hybrid approach that combines their respective strengths by selectively composing them in the process of QPP. We discuss how we can use a training workload to (i) pre-build and materialize such models offline, so that they are readily available for future predictions, and (ii) build new models online as new predictions are needed. All prediction models are built using only static features (available prior to query execution) and the performance values obtained from the offline execution of the training workload. We fully implemented all these techniques and extensions on top of PostgreSQL and evaluated them experimentally by quantifying their effectiveness over analytical workloads, represented by well-established TPC-H data and queries. The results provide quantitative evidence that learning-based modeling for QPP is feasible and effective in both static and dynamic workload scenarios.
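
The contrast between plan-level and operator-level models can be illustrated with a small sketch. This is not the paper's implementation: the training log, the single "estimated rows" feature, the one-variable least-squares fit, and the rule "use the plan-level model when the exact operator sequence was seen in training, otherwise sum per-operator predictions" are all simplifying assumptions made here for illustration. All names (`TRAINING`, `fit_line`, `train`, `predict`) are hypothetical.

```python
from collections import defaultdict


def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b on a single feature."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    var = sum((x - mx) ** 2 for x in xs)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = cov / var if var else 0.0
    return a, my - a * mx


# Hypothetical training log (assumption): each plan is a list of
# (operator_type, estimated_rows, observed_seconds) tuples.
TRAINING = [
    [("SeqScan", 1000, 0.10), ("Sort", 1000, 0.20)],
    [("SeqScan", 2000, 0.20), ("Sort", 2000, 0.40)],
    [("SeqScan", 4000, 0.40), ("HashJoin", 4000, 0.80)],
    [("SeqScan", 8000, 0.80), ("HashJoin", 8000, 1.60)],
]


def train(training):
    """Fit one model per plan shape and one model per operator type."""
    plan_data = defaultdict(lambda: ([], []))
    op_data = defaultdict(lambda: ([], []))
    for plan in training:
        shape = tuple(op for op, _, _ in plan)
        xs, ys = plan_data[shape]
        xs.append(sum(rows for _, rows, _ in plan))  # static plan feature
        ys.append(sum(sec for _, _, sec in plan))    # observed plan latency
        for op, rows, sec in plan:
            oxs, oys = op_data[op]
            oxs.append(rows)
            oys.append(sec)
    plan_models = {k: fit_line(*v) for k, v in plan_data.items()}
    op_models = {k: fit_line(*v) for k, v in op_data.items()}
    return plan_models, op_models


def predict(plan, plan_models, op_models):
    """Predict latency for a plan given as (operator_type, estimated_rows)."""
    shape = tuple(op for op, _ in plan)
    if shape in plan_models:
        # Plan shape seen in training: use the coarse-grained model.
        a, b = plan_models[shape]
        return a * sum(rows for _, rows in plan) + b
    # Unseen shape: compose fine-grained per-operator predictions (naive sum).
    return sum(op_models[op][0] * rows + op_models[op][1] for op, rows in plan)


plan_models, op_models = train(TRAINING)
seen = predict([("SeqScan", 3000), ("Sort", 3000)], plan_models, op_models)
unseen = predict(
    [("SeqScan", 1000), ("HashJoin", 1000), ("Sort", 1000)],
    plan_models,
    op_models,
)
print(f"seen plan shape:   {seen:.3f} s")
print(f"unseen plan shape: {unseen:.3f} s")
```

Only features available before execution (operator types and estimated row counts) feed the models, mirroring the abstract's restriction to static features. The fallback rule is a deliberately crude stand-in for the selective composition the abstract describes.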