In this paper, we introduce self-tuning histograms. Although similar in structure to traditional histograms, self-tuning histograms infer data distributions not by examining the data or a sample thereof, but by using feedback from the query execution engine about the actual selectivity of range selection operators to progressively refine the histogram. Because the cost of building and maintaining self-tuning histograms is independent of the data size, they provide an inexpensive way to construct histograms for large data sets with little up-front cost. Self-tuning histograms are particularly attractive as an alternative to traditional multi-dimensional histograms, which capture dependencies between attributes but are prohibitively expensive to build and maintain. We describe techniques for initializing and refining self-tuning histograms. Our experimental results show that self-tuning histograms provide a low-cost alternative to traditional multi-dimensional histograms with little loss of accuracy for data distributions with low to moderate skew.
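The feedback loop described above can be illustrated with a minimal one-dimensional sketch. The abstract does not spell out the refinement rule, so everything here is an assumption made for illustration: the names `Bucket`, `init_histogram`, `estimate`, and `refine`, the uniform starting guess, the `damping` parameter, and the choice to spread the estimation error over the overlapping buckets in proportion to their current contribution. It is one plausible way to refine a histogram from query feedback, not necessarily the paper's procedure.

```python
from dataclasses import dataclass


@dataclass
class Bucket:
    lo: float    # inclusive lower bound
    hi: float    # exclusive upper bound
    freq: float  # current guess at the number of tuples in [lo, hi)


def overlap_fraction(b, lo, hi):
    """Fraction of bucket b's width covered by the query range [lo, hi)."""
    width = min(b.hi, hi) - max(b.lo, lo)
    return max(width, 0.0) / (b.hi - b.lo)


def init_histogram(lo, hi, n_buckets, n_tuples):
    """Assumed initialization: equal-width buckets, uniform frequencies.
    No data is scanned; only the value range and tuple count are needed."""
    step = (hi - lo) / n_buckets
    return [Bucket(lo + i * step, lo + (i + 1) * step, n_tuples / n_buckets)
            for i in range(n_buckets)]


def estimate(hist, lo, hi):
    """Estimated result size of the range predicate lo <= x < hi."""
    return sum(b.freq * overlap_fraction(b, lo, hi) for b in hist)


def refine(hist, lo, hi, actual, damping=0.5):
    """Use the observed result size `actual` to nudge bucket frequencies.
    The damped error is split across overlapping buckets in proportion
    to each bucket's share of the current estimate (an assumed rule)."""
    error = damping * (actual - estimate(hist, lo, hi))
    shares = [b.freq * overlap_fraction(b, lo, hi) for b in hist]
    if sum(shares) == 0:
        # All overlapping buckets are empty: fall back to overlap alone.
        shares = [overlap_fraction(b, lo, hi) for b in hist]
    total = sum(shares)
    if total == 0:
        return  # the query range does not touch the histogram
    for b, share in zip(hist, shares):
        b.freq = max(b.freq + error * share / total, 0.0)


hist = init_histogram(0, 100, n_buckets=4, n_tuples=1000)
print(estimate(hist, 0, 50))         # 500.0 under the uniform guess
refine(hist, 0, 50, actual=800)      # the engine reports 800 matching tuples
print(estimate(hist, 0, 50))         # 650.0 after one damped correction
```

Each call to `refine` touches only the buckets, never the table, which is why the maintenance cost in this sketch depends on the number of buckets and queries rather than on the data size. The damping factor keeps a single query from overwriting what earlier feedback has already taught the histogram.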