Recently we have proposed an adaptive, random sampling algorithm for general query size estimation. In earlier work we analyzed the asymptotic efficiency and accuracy of the algorithm; in this paper we investigate its practicality as applied to selects and joins. First, we extend our previous analysis to provide significantly improved bounds on the amount of sampling necessary for a given level of accuracy. Next, we provide “sanity bounds” to deal with queries for which the underlying data is extremely skewed or the query result is very small. Finally, we report on the performance of the estimation algorithm as implemented in a host language on a commercial relational system. The results are encouraging, even with this loose coupling between the estimation algorithm and the DBMS.
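To make the idea concrete, the following is a minimal Python sketch of adaptive sampling for query size estimation. It is an illustration under stated assumptions, not the algorithm or the bounds analyzed in the paper. It assumes the query result can be split into `n_parts` disjoint pieces whose sizes can be computed individually (for a join, for example, the tuples of the second relation that match one tuple of the first). The names `adaptive_size_estimate`, `part_size`, `sum_threshold`, and `max_samples` are hypothetical. Here `sum_threshold` stands in for the accuracy-driven stopping condition and `max_samples` for a sanity bound that ends sampling when the result is very small or the data is heavily skewed; the abstract does not give the values a real implementation would use.

```python
import random


def adaptive_size_estimate(n_parts, part_size, sum_threshold, max_samples, rng=None):
    """Estimate the total size of a query result by adaptive random sampling.

    n_parts       -- number of disjoint pieces the result is split into (assumed known)
    part_size     -- hypothetical callback: part_size(i) is the size of piece i
    sum_threshold -- stop once the sampled sizes add up to this much (assumed value)
    max_samples   -- sanity bound on the number of samples (assumed value)
    Returns (estimate, samples_taken).
    """
    if n_parts < 1 or max_samples < 1 or sum_threshold <= 0:
        raise ValueError("n_parts and max_samples must be >= 1, sum_threshold > 0")
    rng = rng or random.Random()

    total = 0
    samples = 0
    # Keep sampling until enough result tuples have been seen,
    # or the sanity bound on the number of samples is reached.
    while total < sum_threshold and samples < max_samples:
        i = rng.randrange(n_parts)  # pick a piece uniformly, with replacement
        total += part_size(i)
        samples += 1

    # Scale the mean sampled piece size up to all pieces.
    return n_parts * total / samples, samples
```

The number of samples adapts to the data: large pieces reach `sum_threshold` quickly, while an empty or tiny result is cut off by `max_samples` instead of sampling indefinitely.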