Histograms summarize the contents of relations into a number of buckets for the estimation of query result sizes. Several techniques have been proposed for determining bucket boundaries that provide accurate estimates. However, while the search strategies for optimal bucket boundaries are rather sophisticated, not much attention has been paid to estimating queries inside buckets, and all of these techniques adopt naive methods for such estimation. This paper focuses on the problem of improving the estimation inside a bucket once its boundaries have been fixed. The proposed technique adds to each bucket one memory word of additional information, organized into a tree-like index, that stores approximate cumulative frequencies in a hierarchical fashion. Both theoretical analysis and experimental results show that the proposed approach improves the accuracy of the estimation inside buckets with respect to both classical approaches (such as the continuous value assumption and the uniform spread assumption) and a number of alternative ways of organizing the additional information. The index is then added to state-of-the-art histograms, yielding the non-obvious result that, despite the spatial overhead, which reduces the number of buckets allowed once the storage space has been fixed, the accuracy of the original methods is strongly improved.
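To make the idea of a per-bucket hierarchical index concrete, the following is a minimal sketch, not the paper's actual index layout. It compares a plain continuous value assumption estimate with an estimate that first consults a small binary tree of quantized partial sums. All function names are hypothetical, and the three-level tree with 6, 5, and 4 bits per node is an assumption chosen only because 1×6 + 2×5 + 4×4 = 32 bits fits in one 32-bit word.

```python
from typing import List, Sequence

BITS = (6, 5, 4)  # assumed bits per node at each tree level (32 bits in total)


def cva_estimate(bucket_sum: float, n: int, a: int, b: int) -> float:
    """Continuous value assumption: the bucket sum is spread evenly over n values."""
    overlap = max(0, min(b, n) - max(a, 0))
    return bucket_sum * overlap / n


def build_index(freqs: Sequence[int], bits: Sequence[int] = BITS) -> List[List[int]]:
    """For every tree node, store the quantized share of its sum held by its left half."""
    index, segments = [], [list(freqs)]
    for width in bits:
        scale = (1 << width) - 1
        codes, children = [], []
        for seg in segments:
            mid, total = len(seg) // 2, sum(seg)
            ratio = sum(seg[:mid]) / total if total else 0.0
            codes.append(round(ratio * scale))
            children += [seg[:mid], seg[mid:]]
        index.append(codes)
        segments = children
    return index


def leaf_sums(bucket_sum: float, index: List[List[int]],
              bits: Sequence[int] = BITS) -> List[float]:
    """Rebuild approximate sums of the finest segments from the stored codes."""
    sums = [float(bucket_sum)]
    for codes, width in zip(index, bits):
        scale = (1 << width) - 1
        children = []
        for total, code in zip(sums, codes):
            left = total * code / scale
            children += [left, total - left]
        sums = children
    return sums


def indexed_estimate(bucket_sum: float, index: List[List[int]], n: int,
                     a: int, b: int) -> float:
    """Estimate the sum of frequencies at positions a..b-1 of an n-value bucket."""
    sums = leaf_sums(bucket_sum, index)
    width = n / len(sums)
    estimate = 0.0
    for i, segment_sum in enumerate(sums):
        lo, hi = i * width, (i + 1) * width
        overlap = max(0.0, min(b, hi) - max(a, lo))
        estimate += segment_sum * overlap / width  # uniform only inside a leaf segment
    return estimate


# Illustrative skewed bucket: 16 values, most of the mass at the first position.
freqs = [40, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 5, 5, 5, 5]
total = sum(freqs)
index = build_index(freqs)
print("true:", sum(freqs[0:4]))
print("CVA:", cva_estimate(total, len(freqs), 0, 4))
print("indexed:", indexed_estimate(total, index, len(freqs), 0, 4))
```

On skewed data such as this made-up bucket, the tree-based estimate tracks the true range sum more closely because uniformity is assumed only inside the finest segments rather than across the whole bucket. The sketch assumes the bucket length is a multiple of 8 so that every leaf segment covers the same number of values.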