Histograms of the distributions of individual attributes are currently used in leading database management systems (e.g., IBM DB2, Oracle Database, and Microsoft SQL Server). Because attribute pairs in databases are seldom independent, however, combining single-attribute distributions under the attribute independence assumption often leads to poor estimates. More accurate answers can be obtained by using multi-dimensional histograms to characterize the joint distribution of two or more attributes. Moving from one-dimensional to two-dimensional histograms raises several new questions about histogram structure: (1) Which attribute should take priority over the other in data partitioning? (2) Into how many partitions should each dimension be split to obtain a desired number of histogram buckets? (3) How many of the most frequent values should be isolated and stored in singleton buckets? Using real data, we show experimentally that our proposed methods for making these structural choices lead to good-quality histograms for a variety of histogram partitioning techniques and various types of data distributions.
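To make the abstract's point about the attribute independence assumption concrete, the following is a minimal sketch, not the paper's method. It assumes a hypothetical toy table with two perfectly correlated attributes and uses plain equi-width partitioning with two partitions per dimension; the helper `bucket` and all variable names are illustrative only. It compares the selectivity estimated from two one-dimensional histograms under independence with the estimate from a two-dimensional histogram.

```python
from collections import Counter


def bucket(v, lo, hi, parts):
    """Index of the equi-width partition of [lo, hi) that contains v."""
    return min(int((v - lo) * parts / (hi - lo)), parts - 1)


# Hypothetical toy data: attribute y always equals attribute x.
rows = [(x, x) for x in range(100)]
LO, HI, PARTS = 0, 100, 2  # assumed domain and partitions per dimension
n = len(rows)

# One-dimensional histograms, one per attribute.
hist_x = Counter(bucket(x, LO, HI, PARTS) for x, _ in rows)
hist_y = Counter(bucket(y, LO, HI, PARTS) for _, y in rows)

# Two-dimensional histogram over the joint distribution.
hist_xy = Counter(
    (bucket(x, LO, HI, PARTS), bucket(y, LO, HI, PARTS)) for x, y in rows
)

# Predicate: x < 50 AND y >= 50, which is x-bucket 0 and y-bucket 1.
true_sel = sum(1 for x, y in rows if x < 50 and y >= 50) / n
indep_sel = (hist_x[0] / n) * (hist_y[1] / n)  # independence assumption
joint_sel = hist_xy[(0, 1)] / n                # joint (2-D) histogram

print("true:", true_sel)
print("independence estimate:", indep_sel)
print("2-D histogram estimate:", joint_sel)
```

On this deliberately extreme data no row satisfies the predicate, yet the independence estimate is a quarter of the table, while the two-dimensional histogram records the empty bucket directly. Real data is rarely this correlated, and the predicate here happens to align with bucket boundaries, so the gap shown is an upper-end illustration rather than a typical result.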