Filtered statistics

Authors: Pawel Terlecki, Hardik Bati, Cesar Galindo-Legaria, Peter Zabback
Affiliation: Microsoft Corp., Redmond, WA, USA (all authors)
Venue: Proceedings of the 2009 ACM SIGMOD International Conference on Management of Data
Year: 2009

Abstract

Column statistics are an important element of cardinality estimation frameworks. More accurate estimates allow the optimizer of an RDBMS to generate better plans and improve the overall efficiency of the system. This paper introduces filtered statistics, which model the value distribution over a set of rows restricted by a predicate. This feature, available in Microsoft SQL Server, can be used to handle column correlation and to focus on interesting data ranges. In particular, it is well suited to scenarios with logical subtables, such as flexible-schema or multi-tenant applications. The paper also presents its integration with the existing cardinality estimation infrastructure.
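
To illustrate why a statistic restricted by a predicate can help with correlated columns, the following is a minimal Python sketch, not the SQL Server implementation. The toy table, the column names (make, model), and the helper functions build_stats, estimate_independent and estimate_filtered are all hypothetical, and a plain frequency table stands in for a real histogram. It compares an estimate that multiplies per-column selectivities under an independence assumption with one read from a frequency table built only over rows where make = 'Honda'.

```python
from collections import Counter

# Hypothetical toy table of (make, model) rows; the two columns are
# strongly correlated because each model belongs to exactly one make.
MAKE, MODEL = 0, 1
rows = (
    [("Honda", "Accord")] * 40
    + [("Honda", "Civic")] * 10
    + [("Toyota", "Camry")] * 30
    + [("Toyota", "Corolla")] * 20
)


def build_stats(rows, column, predicate=None):
    """Return (frequency table, row count) for one column.

    If a predicate is given, only rows satisfying it are counted,
    which is the toy analogue of a filtered statistic.
    """
    selected = [r for r in rows if predicate is None or predicate(r)]
    return Counter(r[column] for r in selected), len(selected)


# Unfiltered, per-column statistics over the whole table.
make_freq, total = build_stats(rows, MAKE)
model_freq, _ = build_stats(rows, MODEL)

# Filtered statistic: distribution of model over rows where make = 'Honda'.
honda_model_freq, honda_rows = build_stats(
    rows, MODEL, predicate=lambda r: r[MAKE] == "Honda"
)


def estimate_independent(make, model):
    """Estimate rows for make = X AND model = Y assuming independent columns."""
    return make_freq[make] * model_freq[model] / total


def estimate_filtered(model):
    """Estimate rows for make = 'Honda' AND model = Y from the filtered statistic."""
    return honda_model_freq[model]


actual = sum(1 for r in rows if r == ("Honda", "Accord"))
print("independence estimate:", estimate_independent("Honda", "Accord"))
print("filtered estimate:    ", estimate_filtered("Accord"))
print("actual rows:          ", actual)
```

In this toy data the independence assumption underestimates the result by half, while the filtered frequency table matches the true count, because the filter predicate captures the correlation between the two columns. A real optimizer would also have to decide when a query predicate is covered by the statistic's filter, which this sketch does not attempt.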