The need for accurate SQL progress estimation in the context of decision-support administration has led to a number of proposed techniques. Unfortunately, no single progress estimator behaves robustly across the variety of SQL queries encountered in practice; each technique performs poorly for a significant fraction of queries. This paper proposes a novel estimator-selection framework that uses a statistical model to characterize the conditions under which certain estimators outperform others, leading to a significant increase in estimation robustness. The generality of this framework also lets us add a number of novel "special purpose" estimators that increase accuracy further. Most importantly, the resulting model generalizes well to queries very different from the ones used to train it. We validate our findings on a large number of real-life industrial and benchmark workloads.
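To make the idea of estimator selection concrete, the following is a minimal, hypothetical sketch: several candidate progress estimators, and a simple learned selector that maps a query's class to the estimator with the lowest observed error on training runs. All names (`gnm_estimator`, `bytes_estimator`, the feature keys) are illustrative assumptions, not the paper's actual estimators or model, which uses a richer statistical characterization.

```python
from collections import defaultdict

def gnm_estimator(state):
    # Tuple-count style estimator: fraction of tuples already produced.
    return state["tuples_done"] / state["tuples_total"]

def bytes_estimator(state):
    # I/O-based estimator: fraction of input bytes already consumed.
    return state["bytes_done"] / state["bytes_total"]

ESTIMATORS = {"gnm": gnm_estimator, "bytes": bytes_estimator}

def train_selector(training_runs):
    """For each query class, pick the estimator with the lowest mean
    absolute error on the training runs -- a crude stand-in for the
    paper's statistical selection model."""
    errors = defaultdict(lambda: defaultdict(list))
    for run in training_runs:
        for name, est in ESTIMATORS.items():
            err = abs(est(run["state"]) - run["true_progress"])
            errors[run["query_class"]][name].append(err)
    return {
        qclass: min(errs, key=lambda n: sum(errs[n]) / len(errs[n]))
        for qclass, errs in errors.items()
    }

def estimate_progress(selector, query_class, state):
    # Dispatch to the estimator the model trusts for this query class;
    # fall back to a default estimator for unseen classes.
    name = selector.get(query_class, "gnm")
    return ESTIMATORS[name](state)
```

In this toy version the "model" is just a per-class argmin over training error; the point is the architecture — a pluggable set of estimators behind a learned dispatcher — which is also what allows special-purpose estimators to be added without changing the rest of the system.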