Query optimizers play a critical role in the success of every relational database system. However, regression testing for optimizers remains an ad hoc, tedious, and time-consuming process. Typically, a large number of SQL query suites are employed for regression testing. These suites are either designed manually, at great cost, by development and QA groups, or collected from customers and from benchmarks such as TPC-H or TPC-DS. Although these suites are useful in capturing regressions, optimizers continue to be plagued by regressions, and fixing the resulting bugs requires expensive human intervention. One possible reason is that these ad hoc regression queries are redundant, in the sense that they do not cover different parts of the optimizer plan space. This paper introduces a novel way in which the optimizer itself is used to generate an economical regression suite. Our approach eliminates the tedium of manually designing a regression suite and removes redundancy from the suite. As a first step towards solving this very difficult problem, this paper focuses on the join plan space for queries over a small number of tables. We show that our generated queries exhibit 50% more distinct join plans than TPC-H and TPC-DS combined. The generated queries have also been very useful for validating the optimizer's cost functions, and hence can be used as a test suite as well. Since this is a new approach, we highlight some of the areas that need a closer look by the research community.
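To make the notion of redundancy concrete, the following is a minimal sketch, not the paper's algorithm: it assumes each query can be mapped to some hashable signature of the join plan the optimizer picks for it, and keeps one query per distinct signature. The names `economical_suite`, `plan_signature`, and `toy_plans` are hypothetical, and the plan signatures are hard-coded for illustration rather than obtained from a real optimizer.

```python
from typing import Callable, Dict, Hashable, Iterable, List


def economical_suite(
    queries: Iterable[str],
    plan_signature: Callable[[str], Hashable],
) -> List[str]:
    """Keep the first query seen for each distinct join plan signature."""
    representatives: Dict[Hashable, str] = {}
    for query in queries:
        signature = plan_signature(query)
        # A query whose plan is already covered adds no new plan-space
        # coverage, so only the first query per signature is retained.
        representatives.setdefault(signature, query)
    return list(representatives.values())


# Hypothetical plan signatures, hard-coded purely for illustration.
toy_plans = {
    "Q1": ("hash_join", ("A", "B")),
    "Q2": ("hash_join", ("A", "B")),    # same plan as Q1, so redundant
    "Q3": ("nested_loop", ("B", "A")),
}

suite = economical_suite(toy_plans, toy_plans.__getitem__)
```

In this toy setting `Q2` is dropped because it maps to the same join plan as `Q1`. In practice the signature would have to come from the optimizer's own plan output, and how that is extracted is system-specific.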