LEO: An autonomic query optimizer for DB2

Authors:
V. Markl;G. M. Lohman;V. Raman
Affiliations:
IBM Research Division, Almaden Research Center, 650 Harry Road, San Jose, California 95120;IBM Research Division, Almaden Research Center, 650 Harry Road, San Jose, California 95120;IBM Research Division, Almaden Research Center, 650 Harry Road, San Jose, California 95120
Venue:
IBM Systems Journal
Year:
2003

Citing 16
Cited 20

On estimating the cardinality of the projection of a database relation

ACM Transactions on Database Systems (TODS)
On the propagation of errors in the size of join results

SIGMOD '91 Proceedings of the 1991 ACM SIGMOD international conference on Management of data
Multiple join size estimation by virtual domains (extended abstract)

PODS '93 Proceedings of the twelfth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems
On the estimation of join result sizes

EDBT '94 Proceedings of the 4th international conference on extending database technology: Advances in database technology
Improved histograms for selectivity estimation of range predicates

SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data
Materialized views and data warehouses

ACM SIGMOD Record
Efficient mid-query re-optimization of sub-optimal query execution plans

SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Cost-based query scrambling for initial delays

SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Answering complex SQL queries using automatic summary tables

SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Learning table access cardinalities with LEO

Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Access path selection in a relational database management system

SIGMOD '79 Proceedings of the 1979 ACM SIGMOD international conference on Management of data
Optimizing Queries with Materialized Views

ICDE '95 Proceedings of the Eleventh International Conference on Data Engineering
Selectivity Estimation and Query Optimization in Large Databases with Highly Skewed Distribution of Column Values

VLDB '88 Proceedings of the 14th International Conference on Very Large Data Bases
A Formal Perspective on the View Selection Problem

Proceedings of the 27th International Conference on Very Large Data Bases
LEO - DB2's LEarning Optimizer

Proceedings of the 27th International Conference on Very Large Data Bases
Selectivity Estimation Without the Attribute Value Independence Assumption

VLDB '97 Proceedings of the 23rd International Conference on Very Large Data Bases

The dawning of the autonomic computing era

IBM Systems Journal
Autonomic Web-Based Simulation

ANSS '05 Proceedings of the 38th annual Symposium on Simulation
Goals and benchmarks for autonomic configuration recommenders

Proceedings of the 2005 ACM SIGMOD international conference on Management of data
Making database systems usable

Proceedings of the 2007 ACM SIGMOD international conference on Management of data
A framework for enforcing application policies in database systems

Proceedings of the 2007 ACM SIGMOD international conference on Management of data
Automatic SQL tuning in oracle 10g

VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
A survey of autonomic computing—degrees, models, and applications

ACM Computing Surveys (CSUR)
Optimizer plan change management: improved stability and performance in Oracle 11g

Proceedings of the VLDB Endowment
Architecture of a Database System

Foundations and Trends in Databases
Dynamic plan generation for parameterized queries

Proceedings of the 2009 ACM SIGMOD International Conference on Management of data
An Ontology-Based Autonomic System for Improving Data Warehouse Performances

KES '09 Proceedings of the 13th International Conference on Knowledge-Based and Intelligent Information and Engineering Systems: Part I
Automation everywhere: autonomics and data management

BNCOD'07 Proceedings of the 24th British national conference on Databases
Online monitoring and visualisation of database structural deterioration

International Journal of Autonomic Computing
Improving architecture-based self-adaptation using preemption

SOAR'09 Proceedings of the First international conference on Self-organizing architectures
A bayesian approach to online performance modeling for database appliances using gaussian models

Proceedings of the 8th ACM international conference on Autonomic computing
New algorithms for join and grouping operations

Computer Science - Research and Development
Subquadratic algorithms for workload-aware haar wavelet synopses

FSTTCS '05 Proceedings of the 25th international conference on Foundations of Software Technology and Theoretical Computer Science
Making self-adaptation an engineering reality

Self-star Properties in Complex Information Systems
Organic databases

DNIS'11 Proceedings of the 7th international conference on Databases in Networked Information Systems
Chimera: a declarative language for streaming network traffic analysis

Security'12 Proceedings of the 21st USENIX conference on Security symposium

Abstract

Structured Query Language (SQL) has emerged as an industry standard for querying relational database management systems, largely because a user need only specify what data are wanted, not the details of how to access those data. A query optimizer uses a mathematical model of query execution to determine automatically the best way to access and process any given SQL query. This model is heavily dependent upon the optimizer's estimates for the number of rows that will result at each step of the query execution plan (QEP), especially for complex queries involving many predicates and/or operations. These estimates rely upon statistics on the database and modeling assumptions that may or may not be true for a given database. In this paper, we discuss an autonomic query optimizer that automatically self-validates its model without requiring any user interaction to repair incorrect statistics or cardinality estimates. By monitoring queries as they execute, the autonomic optimizer compares the optimizer's estimates with actual cardinalities at each step in a QEP, and computes adjustments to its estimates that may be used during future optimizations of similar queries. Moreover, the detection of estimation errors can also trigger reoptimization of a query in mid-execution. The autonomic refinement of the optimizer's model can result in a reduction of query execution time by orders of magnitude at negligible additional run-time cost. We discuss various research issues and practical considerations that were addressed during our implementation of a first prototype of LEO, a LEarning Optimizer for DB2® (Database 2™) that learns table access cardinalities and for future queries corrects the estimation error for simple predicates by adjusting the database statistics of DB2.
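
The feedback loop described in the abstract (compare estimated and actual cardinalities at each plan step, then derive an adjustment for later optimizations) can be illustrated with a minimal Python sketch. This is not LEO's implementation: the names `StepObservation`, `learn_adjustments`, and `adjusted_estimate`, the multiplicative adjustment factor, and the error threshold of 2.0 are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class StepObservation:
    """One monitored plan step (hypothetical record layout)."""
    predicate: str         # e.g. "orders.status = 'OPEN'"
    estimated_rows: float  # optimizer's estimate for this step
    actual_rows: float     # row count observed during execution


def adjustment_factor(obs: StepObservation, floor: float = 1.0) -> float:
    """Ratio of actual to estimated rows, with both clamped to `floor`
    so that a zero estimate or zero actual does not break the ratio."""
    estimated = max(obs.estimated_rows, floor)
    actual = max(obs.actual_rows, floor)
    return actual / estimated


def learn_adjustments(observations, threshold: float = 2.0) -> dict:
    """Keep a factor only where the estimate was off by at least
    `threshold` in either direction (threshold is an assumed value)."""
    adjustments = {}
    for obs in observations:
        factor = adjustment_factor(obs)
        if factor >= threshold or factor <= 1.0 / threshold:
            adjustments[obs.predicate] = factor
    return adjustments


def adjusted_estimate(predicate: str, raw_estimate: float, adjustments: dict) -> float:
    """Apply a learned factor to a new estimate; unknown predicates pass through."""
    return raw_estimate * adjustments.get(predicate, 1.0)


observations = [
    StepObservation("orders.status = 'OPEN'", estimated_rows=100, actual_rows=5000),
    StepObservation("orders.region = 'EU'", estimated_rows=1000, actual_rows=1100),
]
adjustments = learn_adjustments(observations)
```

In this sketch only the first predicate is recorded, because its estimate was off by a factor of 50, while the second is within the assumed tolerance. A later call such as `adjusted_estimate("orders.status = 'OPEN'", 200, adjustments)` scales the raw estimate by the learned factor.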