The need for accurate SQL progress estimation in the context of decision-support administration has led to a number of proposed techniques. Unfortunately, no single progress estimator behaves robustly across the variety of SQL queries encountered in practice; each technique performs poorly for a significant fraction of queries. This paper proposes a novel estimator-selection framework that uses a statistical model to characterize the conditions under which certain estimators outperform others, leading to a significant increase in estimation robustness. The generality of this framework also lets us add a number of novel "special purpose" estimators that increase accuracy further. Most importantly, the resulting model generalizes well to queries very different from the ones used to train it. We validate our findings on a large number of real-life industrial and benchmark workloads.
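To make the idea of estimator selection concrete, the following is a minimal, hypothetical sketch: several candidate progress estimators, and a simple learned selector that maps a query's class to the estimator with the lowest observed error on training runs. All names (`gnm_estimator`, `bytes_estimator`, the feature keys) are illustrative assumptions, not the paper's actual estimators or model, which uses a richer statistical characterization.

```python
from collections import defaultdict

def gnm_estimator(state):
    # Tuple-count style estimator: fraction of tuples already produced.
    return state["tuples_done"] / state["tuples_total"]

def bytes_estimator(state):
    # I/O-based estimator: fraction of input bytes already consumed.
    return state["bytes_done"] / state["bytes_total"]

ESTIMATORS = {"gnm": gnm_estimator, "bytes": bytes_estimator}

def train_selector(training_runs):
    """For each query class, pick the estimator with the lowest mean
    absolute error on the training runs -- a crude stand-in for the
    paper's statistical selection model."""
    errors = defaultdict(lambda: defaultdict(list))
    for run in training_runs:
        for name, est in ESTIMATORS.items():
            err = abs(est(run["state"]) - run["true_progress"])
            errors[run["query_class"]][name].append(err)
    return {
        qclass: min(errs, key=lambda n: sum(errs[n]) / len(errs[n]))
        for qclass, errs in errors.items()
    }

def estimate_progress(selector, query_class, state):
    # Dispatch to the estimator the model trusts for this query class;
    # fall back to a default estimator for unseen classes.
    name = selector.get(query_class, "gnm")
    return ESTIMATORS[name](state)
```

In this toy version the "model" is just a per-class argmin over training error; the point is the architecture — a pluggable set of estimators behind a learned dispatcher — which is also what allows special-purpose estimators to be added without changing the rest of the system.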