A characterization of the sensitivity of query optimization to storage access cost parameters

  • Authors:
  • Frederick R. Reiss;Tapas Kanungo

  • Affiliations:
  • University of California, Berkeley, CA;IBM Almaden Research Center, San Jose, CA

  • Venue:
  • Proceedings of the 2003 ACM SIGMOD international conference on Management of data
  • Year:
  • 2003

Quantified Score

Hi-index 0.00

Visualization

Abstract

Most relational query optimizers make use of information about the costs of accessing tuples and data structures on various storage devices. This information can at times be off by several orders of magnitude due to human error in configuration setup, sudden changes in load, or hardware failure. In this paper, we attempt to answer the following questions:• Are inaccurate access cost estimates likely to cause a typical query optimizer to choose a suboptimal query plan?• If an optimizer chooses a suboptimal plan as a result of inaccurate access cost estimates, how far from optimal is this plan likely to be?To address these issues, we provide a theoretical, vector-based framework for analyzing the costs of query plans under various storage parameter costs. We then use this geometric framework to characterize experimentally a commercial query optimizer. We develop algorithms for extracting detailed information about query plans through narrow optimizer interfaces, and we perform the characterization using database statistics from a published run of the TPC-H benchmark and a wide range of storage parameters.We show that, when data structures such as tables, indexes, and sorted runs reside on different storage devices, the optimizer can derive significant benefits from having accurate and timely information regarding the cost of accessing storage devices.