Targeted genetic test SQL generation for the DB2 database

Authors:
Dominic Letarte;Francois Gauthier;Ettore Merlo;Nattavut Sutyanyong;Calisto Zuzarte
Affiliations:
École Polytechnique de Montréal;École Polytechnique de Montréal;École Polytechnique de Montréal;IBM Canada Ltd.;IBM Canada Ltd.
Venue:
DBTest '12 Proceedings of the Fifth International Workshop on Testing Database Systems
Year:
2012

Abstract

Automatic query generators have been shown to be effective tools for software testing. For the most part, they have been used either in system testing of the database as a whole or to generate specific queries that test specific features with little randomness. In this work we explore the problems encountered when using a genetic algorithm to generate SQL for testing a large database system. General random SQL generation that tests the database system as a whole using genetic algorithms is relatively simple, but one would need to generate millions of test cases to have a reasonable chance of hitting specific combinations of features. To optimize the testing, one needs to generate targeted SQL queries that narrow the testing to specific feature areas and feature combinations, yet preserve a certain amount of randomness and exploit the strength of a genetic algorithm. To do this effectively, the test generator needs to be guided so that it does not stray too far from the goals of the more targeted test requirement. We explore a genetic algorithm approach to generating test queries that exercise target sub-sequences of features. Genetic algorithm parameters such as genome representation, reproduction, fitness evaluation, and selection are described. Preliminary results comparing the presented approach with a random query generator are presented and discussed. We further present the DB2 SQL Query Optimizer, the application we use as a case study, and target queries that go through certain optimization rule sequences. This application is larger and more complex, in terms of code size and data input complexity, than software previously used for studying test data generation.
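The four genetic algorithm components named in the abstract (genome representation, reproduction, fitness evaluation, and selection) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not the authors' design: the feature alphabet `FEATURES`, the target sub-sequence `TARGET`, the fixed-length list genome, one-point crossover, per-gene mutation, and tournament selection are all hypothetical choices, since the abstract does not specify them. The sketch also works on abstract feature names rather than real SQL text or DB2 optimizer rules.

```python
import random

# Hypothetical feature alphabet; the paper's actual genome encoding is not given in the abstract.
FEATURES = ["SELECT", "JOIN", "SUBQUERY", "GROUP_BY", "HAVING", "UNION", "ORDER_BY", "DISTINCT"]
# Assumed target sub-sequence of features that a generated test should exercise, in order.
TARGET = ["JOIN", "SUBQUERY", "GROUP_BY"]


def fitness(genome, target=TARGET):
    """Length of the longest prefix of `target` that occurs, in order, within `genome`."""
    matched = 0
    for feature in genome:
        if matched < len(target) and feature == target[matched]:
            matched += 1
    return matched


def crossover(a, b, rng):
    """One-point crossover of two equal-length genomes."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def mutate(genome, rng, rate=0.1):
    """Replace each gene with a randomly chosen feature with probability `rate`."""
    return [rng.choice(FEATURES) if rng.random() < rate else gene for gene in genome]


def tournament(population, rng, k=3):
    """Return the fittest of k randomly sampled individuals."""
    return max(rng.sample(population, k), key=fitness)


def evolve(pop_size=40, genome_len=6, generations=50, seed=0):
    """Evolve feature sequences toward TARGET and return the best one found."""
    rng = random.Random(seed)
    population = [[rng.choice(FEATURES) for _ in range(genome_len)] for _ in range(pop_size)]
    for _ in range(generations):
        best = max(population, key=fitness)
        if fitness(best) == len(TARGET):
            break
        # Elitism: carry the best individual over, then fill the rest with offspring.
        offspring = [best]
        while len(offspring) < pop_size:
            child = crossover(tournament(population, rng), tournament(population, rng), rng)
            offspring.append(mutate(child, rng))
        population = offspring
    return max(population, key=fitness)


if __name__ == "__main__":
    best = evolve()
    print(best, fitness(best))
```

In this sketch the fitness function supplies the guidance the abstract calls for: it rewards genomes that contain more of the target sub-sequence in order, while mutation and crossover keep the remaining positions random. A purely random generator corresponds to drawing genomes without any selection step.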