Equi-depth multidimensional histograms
SIGMOD '88 Proceedings of the 1988 ACM SIGMOD international conference on Management of data
Elements of information theory
Elements of information theory
Numerical recipes in C (2nd ed.): the art of scientific computing
Numerical recipes in C (2nd ed.): the art of scientific computing
Balancing histogram optimality and practicality for query result size estimation
SIGMOD '95 Proceedings of the 1995 ACM SIGMOD international conference on Management of data
Improved histograms for selectivity estimation of range predicates
SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data
The space complexity of approximating the frequency moments
STOC '96 Proceedings of the twenty-eighth annual ACM symposium on Theory of computing
Range queries in OLAP data cubes
SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
Wavelet-based histograms for selectivity estimation
SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Approximate computation of multidimensional aggregates of sparse data using wavelets
SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
Approximating multi-dimensional aggregate range queries over real attributes
SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Optimal and approximate computation of summary statistics for range aggregates
PODS '01 Proceedings of the twentieth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Independence is good: dependency-based histogram synopses for high-dimensional data
SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
STHoles: a multidimensional workload-aware histogram
SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
Fast, small-space algorithms for approximate histogram maintenance
STOC '02 Proceedings of the thiry-fourth annual ACM symposium on Theory of computing
Exploiting statistics on query expressions for optimization
Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Dynamic multidimensional histograms
Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Wavelet synopses with error guarantees
Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Totals
Data Mining and Knowledge Discovery
Optimal Histograms with Quality Guarantees
VLDB '98 Proceedings of the 24rd International Conference on Very Large Data Bases
Histogram-Based Approximation of Set-Valued Query-Answers
VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
Combining Histograms and Parametric Curve Fitting for Feedback-Driven Query Result-size Estimation
VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
Approximate Query Processing Using Wavelets
VLDB '00 Proceedings of the 26th International Conference on Very Large Data Bases
Modeling Skewed Distribution Using Multifractals and the `80-20' Law
VLDB '96 Proceedings of the 22th International Conference on Very Large Data Bases
Selectivity Estimation Without the Attribute Value Independence Assumption
VLDB '97 Proceedings of the 23rd International Conference on Very Large Data Bases
Mining Deviants in a Time Series Database
VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
Relative Prefix Sums: An Efficient Approach for Querying Dynamic OLAP Data Cubes
ICDE '99 Proceedings of the 15th International Conference on Data Engineering
Improving Range Query Estimation on Histograms
ICDE '02 Proceedings of the 18th International Conference on Data Engineering
Concise and accurate data summaries for fast approximate query answering
Concise and accurate data summaries for fast approximate query answering
The history of histograms (abridged)
VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
REHIST: relative error histogram construction algorithms
VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
Optimality and scalability in lattice histogram construction
Proceedings of the VLDB Endowment
Synopses for Massive Data: Samples, Histograms, Wavelets, Sketches
Foundations and Trends in Databases
Sensitivity of self-tuning histograms: query order affecting accuracy and robustness
SSDBM'12 Proceedings of the 24th international conference on Scientific and Statistical Database Management
Data summarization ontology-based query processing
Expert Systems with Applications: An International Journal
Hi-index | 0.00 |
Histograms have been widely used for selectivity estimation in query optimization, as well as for fast approximate query answering in many OLAP, data mining, and data visualization applications. This paper presents a new family of histograms, the Hierarchical Model Fitting (HMF) histograms, based on the Minimum Description Length principle. Rather than having each bucket of a histogram described by the same type of model, the HMF histograms employ a local optimal model for each bucket. The improved effectiveness of the locally chosen models offsets more than the overhead of keeping track of the representation of each individual bucket. Through a set of experiments, we show that the HMF histograms are capable of providing more accurate approximations than previously proposed techniques for many real and synthetic data sets across a variety of query workloads.