Improved histograms for selectivity estimation of range predicates
SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data
Wavelet-based histograms for selectivity estimation
SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Independence is good: dependency-based histogram synopses for high-dimensional data
SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
Optimal Histograms with Quality Guarantees
VLDB '98 Proceedings of the 24rd International Conference on Very Large Data Bases
Combining Histograms and Parametric Curve Fitting for Feedback-Driven Query Result-size Estimation
VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
Optimizing Multi-Feature Queries for Image Databases
VLDB '00 Proceedings of the 26th International Conference on Very Large Data Bases
Query Processing Issues in Image(Multimedia) Databases
ICDE '99 Proceedings of the 15th International Conference on Data Engineering
Optimal aggregation algorithms for middleware
Journal of Computer and System Sciences - Special issu on PODS 2001
Efficient top-K query calculation in distributed networks
Proceedings of the twenty-third annual ACM symposium on Principles of distributed computing
KLEE: a framework for distributed top-k query algorithms
VLDB '05 Proceedings of the 31st international conference on Very large data bases
Wavelet synopses for general error metrics
ACM Transactions on Database Systems (TODS) - Special Issue: SIGMOD/PODS 2004
The history of histograms (abridged)
VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
Efficient processing of distributed top-k queries
DEXA'05 Proceedings of the 16th international conference on Database and Expert Systems Applications
Hi-index | 0.01 |
Accurate selectivity estimations are essential for query optimization decisions where they are typically derived from various kinds of histograms which condense value distributions into compact representations. The estimation accuracy of existing approaches typically varies across the domain, with some estimations being very accurate and some quite inaccurate. This is in particular unfortunate when performing a parametric search using these estimations, as the estimation artifacts can dominate the search results. We propose the usage of linear splines to construct histograms with known error guarantees across the whole continuous domain. These histograms are particularly well suited for using the estimates in parameter optimization. We show by a comprehensive performance evaluation using both synthetic and real world data that our approach clearly outperforms existing techniques.