Randomized algorithms for optimizing large join queries

  • Authors:
  • Y. E. Ioannidis;Younkyung Kang

  • Affiliations:
  • Computer Sciences Department, University of Wisconsin, Madison, WI;Computer Sciences Department, University of Wisconsin, Madison, WI

  • Venue:
  • SIGMOD '90 Proceedings of the 1990 ACM SIGMOD international conference on Management of data
  • Year:
  • 1990

Quantified Score

Hi-index 0.00

Visualization

Abstract

Query optimization for relational database systems is a combinatorial optimization problem, which makes exhaustive search unacceptable as the query size grows. Randomized algorithms, such as Simulated Annealing (SA) and Iterative Improvement (II), are viable alternatives to exhaustive search. We have adapted these algorithms to the optimization of project-select-join queries. We have tested them on large queries of various types with different databases, concluding that in most cases SA identifies a lower cost access plan than II. To explain this result, we have studied the shape of the cost function over the solution space associated with such queries and we have conjectured that it resembles a 'cup' with relatively small variations at the bottom. This has inspired a new Two Phase Optimization algorithm, which is a combination of Simulated Annealing and Iterative Improvement. Experimental results show that Two Phase Optimization outperforms the original algorithms in terms of both output quality and running time.