Testing cardinality estimation models in SQL Server

Authors:
Campbell Fraser;Leo Giakoumakis;Vikas Hamine;Katherine F. Moore-Smith
Affiliations:
Microsoft Corporation, Redmond, WA;Microsoft Corporation, Redmond, WA;Microsoft Corporation, Redmond, WA;Microsoft Corporation, Redmond, WA
Venue:
DBTest '12 Proceedings of the Fifth International Workshop on Testing Database Systems
Year:
2012

Abstract

Reliable query optimization depends heavily on accurate Cardinality Estimation (CE), which is inherently inexact because it relies on statistical information. In commercial database systems, cardinality estimation models are sophisticated components that can become very complex over years of development. The code that implements cardinality estimation models, like most complex software systems that handle a large space of possible inputs and conditions, can deviate from its original architecture and design points over time. Hence, it is often necessary to refactor and redesign the entire system to accommodate new inputs and conditions, and also to reflect existing ones in a more intentional way. In this paper, we describe such an exercise: the replacement of the existing cardinality estimation model in Microsoft SQL Server with a new one, and the validation of that new model. We describe the motivation behind this change, and provide a high-level sketch of the empirical methods used to ensure that the new cardinality estimation model satisfies its goals while minimizing the potential risk of plan regressions for existing customers.