Many current database systems use histograms to approximate the frequency distribution of values in the attributes of relations and, based on these approximations, estimate query result sizes and access plan costs. Choosing among the various histograms requires balancing two conflicting goals: optimality, so that the generated estimates have the least error, and practicality, so that histograms can be constructed and maintained efficiently. In this paper, we present both theoretical and experimental results on several issues related to this trade-off. Our overall conclusion is that the most effective approach is to focus on the class of histograms that accurately maintain the frequencies of a few attribute values and assume a uniform distribution for the rest, and to choose for each relation the histogram in that class that is optimal for a self-join query.
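To make the recommended class of histograms concrete, the following is a minimal sketch, not the construction algorithm from the paper itself. It assumes that the "few attribute values" kept exactly are those with the highest and/or lowest frequencies, that all remaining values share a single average frequency, and that "optimal for a self-join" means minimizing the absolute error of the estimated self-join size (the sum of squared frequencies). All function names are hypothetical.

```python
from collections import Counter


def build_biased_histogram(values, n_high, n_low=0):
    """Keep exact frequencies for the n_high most frequent and n_low least
    frequent values (an assumption of this sketch); summarize every other
    value by one average frequency, i.e. the uniform assumption."""
    ranked = sorted(Counter(values).items(), key=lambda kv: -kv[1])
    if n_high + n_low > len(ranked):
        raise ValueError("more exact entries requested than distinct values")
    exact = dict(ranked[:n_high] + ranked[len(ranked) - n_low:])
    rest = [f for v, f in ranked if v not in exact]
    rest_avg = sum(rest) / len(rest) if rest else 0.0
    return exact, len(rest), rest_avg


def estimate_self_join(hist):
    """Estimated self-join size: sum of squared (approximate) frequencies."""
    exact, rest_count, rest_avg = hist
    return sum(f * f for f in exact.values()) + rest_count * rest_avg ** 2


def true_self_join(values):
    """Exact self-join size computed from the full frequency distribution."""
    return sum(f * f for f in Counter(values).values())


def best_biased_histogram(values, n_exact):
    """Try every split of n_exact exact entries between the high and low
    ends and keep the histogram with the smallest self-join error."""
    truth = true_self_join(values)
    candidates = (
        build_biased_histogram(values, h, n_exact - h)
        for h in range(n_exact + 1)
    )
    return min(candidates, key=lambda h: abs(truth - estimate_self_join(h)))


# Skewed toy column: frequencies a=6, b=3, c=2, d=1.
column = ["a"] * 6 + ["b"] * 3 + ["c"] * 2 + ["d"]
best = best_biased_histogram(column, n_exact=1)
```

On this toy column, keeping the single most frequent value exactly gives a closer self-join estimate than keeping the least frequent one, which matches the intuition that skew is concentrated in a few heavy values. The exhaustive search over splits is only practical because the number of exact entries is small.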