Selectivity Estimation in the Presence of Alphanumeric Correlations

Authors:
Min Wang;Jeffrey Scott Vitter;Balakrishna R. Iyer
Venue:
ICDE '97 Proceedings of the Thirteenth International Conference on Data Engineering
Year:
1997

Cited by (10):

Selectivity estimation for Boolean queries
PODS '00 Proceedings of the nineteenth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems

Algorithmics and applications of tree and graph searching
Proceedings of the twenty-first ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems

Using histograms to estimate answer sizes for XML queries
Information Systems - Special issue: Best papers from EDBT 2002

Estimating Answer Sizes for XML Queries
EDBT '02 Proceedings of the 8th International Conference on Extending Database Technology: Advances in Database Technology

Multi-Dimensional Substring Selectivity Estimation
VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases

One-dimensional and multi-dimensional substring selectivity estimation
The VLDB Journal — The International Journal on Very Large Data Bases

Generalized substring selectivity estimation
Journal of Computer and System Sciences - Special issue on PODS 2000

CXHist: an on-line classification-based histogram for XML string selectivity estimation
VLDB '05 Proceedings of the 31st international conference on Very large data bases

The history of histograms (abridged)
VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29

The VC-dimension of SQL queries and selectivity estimation through sampling
ECML PKDD'11 Proceedings of the 2011 European conference on Machine learning and knowledge discovery in databases - Volume Part II

Abstract

Query optimization is an integral part of relational database management systems. One important task in query optimization is selectivity estimation, that is, given a query P, we need to estimate the fraction of records in the database that satisfy P. Almost all previous work dealt with the estimation of numeric selectivity, i.e., the query contains only numeric variables. The general problem of estimating alphanumeric selectivity is much more difficult and has attracted attention only very recently, and the focus has been on the special case when only one column is involved. In this paper, we consider the more general case when there are two correlated alphanumeric columns. We develop efficient algorithms to build storage structures that can fit in a database catalog. Results from our extensive experiments to test our algorithms, on the basis of error analysis and space requirements, are given to guide DBMS implementors.
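
To make the definition of selectivity in the abstract concrete, the following is a minimal sketch, not the paper's algorithm or storage structure. It computes the exact selectivity of a predicate by a full scan and compares it with the product of the per-column selectivities, which is what an estimator would report if it assumed the two columns were independent. The toy table, the column meanings (city, state), and the function name `selectivity` are all hypothetical and chosen only for illustration.

```python
from typing import Callable, List, Tuple

Row = Tuple[str, str]  # (city, state): hypothetical alphanumeric columns


def selectivity(rows: List[Row], pred: Callable[[Row], bool]) -> float:
    """Exact fraction of rows that satisfy pred, computed by a full scan."""
    if not rows:
        return 0.0
    return sum(1 for r in rows if pred(r)) / len(rows)


# Hypothetical toy table in which the two columns are correlated.
rows: List[Row] = [
    ("san jose", "ca"),
    ("san diego", "ca"),
    ("santa fe", "nm"),
    ("austin", "tx"),
    ("dallas", "tx"),
    ("boston", "ma"),
    ("albany", "ny"),
    ("fresno", "ca"),
]


def city_has_san(r: Row) -> bool:
    # Substring predicate on the first column.
    return "san" in r[0]


def state_is_ca(r: Row) -> bool:
    # Equality predicate on the second column.
    return r[1] == "ca"


# Exact selectivity of the conjunctive query over both columns.
exact = selectivity(rows, lambda r: city_has_san(r) and state_is_ca(r))

# Estimate obtained by multiplying single-column selectivities,
# i.e. by assuming (incorrectly here) that the columns are independent.
independent = selectivity(rows, city_has_san) * selectivity(rows, state_is_ca)

print(f"exact selectivity:      {exact}")
print(f"independence estimate:  {independent}")
```

On this toy data the independence estimate falls below the exact value because the two predicates tend to hold on the same rows. A catalog structure that summarizes only one column at a time cannot capture that gap, which is the motivation the abstract gives for treating two correlated alphanumeric columns together.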