Regeneration and networks of queues
Regeneration and networks of queues
Statistical profile estimation in database systems
ACM Computing Surveys (CSUR)
Processing aggregate relational queries with hard time constraints
SIGMOD '89 Proceedings of the 1989 ACM SIGMOD international conference on Management of data
Estimating the size of generalized transitive closures
VLDB '89 Proceedings of the 15th international conference on Very large data bases
VLDB '89 Proceedings of the 15th international conference on Very large data bases
Practical selectivity estimation through adaptive sampling
SIGMOD '90 Proceedings of the 1990 ACM SIGMOD international conference on Management of data
Error-constrained COUNT query evaluation in relational databases
SIGMOD '91 Proceedings of the 1991 ACM SIGMOD international conference on Management of data
The art of computer programming, volume 3: (2nd ed.) sorting and searching
The art of computer programming, volume 3: (2nd ed.) sorting and searching
Query size estimation by adaptive sampling (extended abstract)
PODS '90 Proceedings of the ninth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems
Statistical estimators for relational algebra expressions
Proceedings of the seventh ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems
An effective algorithm for parallelizing sort merge joins in the presence of data skew
DPDS '90 Proceedings of the second international symposium on Databases in parallel and distributed systems
Query Optimization in Database Systems
ACM Computing Surveys (CSUR)
A bayesian approach to selected database issues
A bayesian approach to selected database issues
Fixed-precision estimation of join selectivity
PODS '93 Proceedings of the twelfth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems
SIGMOD '93 Proceedings of the 1993 ACM SIGMOD international conference on Management of data
On the development of a site selection optimizer for distributed and parallel database systems
CIKM '93 Proceedings of the second international conference on Information and knowledge management
On the relative cost of sampling for join selectivity estimation
PODS '94 Proceedings of the thirteenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems
The power of sampling in knowledge discovery
PODS '94 Proceedings of the thirteenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems
Combinatorial pattern discovery for scientific data: some preliminary results
SIGMOD '94 Proceedings of the 1994 ACM SIGMOD international conference on Management of data
Adaptive selectivity estimation using query feedback
SIGMOD '94 Proceedings of the 1994 ACM SIGMOD international conference on Management of data
Computation of partial query results with an adaptive stratified sampling technique
CIKM '95 Proceedings of the fourth international conference on Information and knowledge management
Balancing histogram optimality and practicality for query result size estimation
SIGMOD '95 Proceedings of the 1995 ACM SIGMOD international conference on Management of data
Bifocal sampling for skew-resistant join size estimation
SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data
Estimating alphanumeric selectivity in the presence of wildcards
SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data
Random sampling for histogram construction: how much is enough?
SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Wavelet-based histograms for selectivity estimation
SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Data cube approximation and histograms via wavelets
Proceedings of the seventh international conference on Information and knowledge management
Approximating multi-dimensional aggregate range queries over real attributes
SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Dynamic multidimensional histograms
Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Wavelet synopses with error guarantees
Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Effective Query Size Estimation Using Neural Networks
Applied Intelligence
Processing Distributed Mobile Queries with Interleaved Remote Mobile Joins
IEEE Transactions on Computers
Optimization of Parallel Execution for Multi-Join Queries
IEEE Transactions on Knowledge and Data Engineering
A Hybrid Estimator for Selectivity Estimation
IEEE Transactions on Knowledge and Data Engineering
Reducing the Braking Distance of an SQL Query Engine
VLDB '98 Proceedings of the 24rd International Conference on Very Large Data Bases
Optimal Histograms with Quality Guarantees
VLDB '98 Proceedings of the 24rd International Conference on Very Large Data Bases
Selectivity Estimation in Extensible Databases - A Neural Network Approach
VLDB '98 Proceedings of the 24rd International Conference on Very Large Data Bases
Managing Memory to Meet Multiclass Workload Response Time Goals
VLDB '93 Proceedings of the 19th International Conference on Very Large Data Bases
Sampling Large Databases for Association Rules
VLDB '96 Proceedings of the 22th International Conference on Very Large Data Bases
Approximate Query Answering In Numerical Databases
SSDBM '96 Proceedings of the Eighth International Conference on Scientific and Statistical Database Management
Join algorithm costs revisited
The VLDB Journal — The International Journal on Very Large Data Bases
Query processing and optimization in Oracle Rdb
The VLDB Journal — The International Journal on Very Large Data Bases
Handbook of data mining and knowledge discovery
Efficient Biased Sampling for Approximate Clustering and Outlier Detection in Large Data Sets
IEEE Transactions on Knowledge and Data Engineering
Interchanging group-by and join in distributed query processing
CASCON '93 Proceedings of the 1993 conference of the Centre for Advanced Studies on Collaborative research: distributed computing - Volume 2
An integrated method for estimating selectivities in a multidatabase system
CASCON '93 Proceedings of the 1993 conference of the Centre for Advanced Studies on Collaborative research: distributed computing - Volume 2
A new histogram method for sparse attributes: the averaged rectangular attribute cardinality map
ISICT '03 Proceedings of the 1st international symposium on Information and communication technologies
Query Size Estimation for Joins Using Systematic Sampling
Distributed and Parallel Databases
Probabilistic wavelet synopses
ACM Transactions on Database Systems (TODS)
Effective use of block-level sampling in statistics estimation
SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
Deterministic wavelet thresholding for maximum-error metrics
PODS '04 Proceedings of the twenty-third ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Synopses for query optimization: a space-complexity perspective
PODS '04 Proceedings of the twenty-third ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Selectivity estimators for multidimensional range queries over real attributes
The VLDB Journal — The International Journal on Very Large Data Bases
Towards a robust query optimizer: a principled and practical approach
Proceedings of the 2005 ACM SIGMOD international conference on Management of data
One-pass wavelet synopses for maximum-error metrics
VLDB '05 Proceedings of the 31st international conference on Very large data bases
Wavelet synopses for general error metrics
ACM Transactions on Database Systems (TODS) - Special Issue: SIGMOD/PODS 2004
Synopses for query optimization: A space-complexity perspective
ACM Transactions on Database Systems (TODS) - Special Issue: SIGMOD/PODS 2004
Approximate quantiles and the order of the stream
Proceedings of the twenty-fifth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases
Rk-hist: an r-tree based histogram for multi-dimensional selectivity estimation
Proceedings of the sixteenth ACM conference on Conference on information and knowledge management
Analytic-based estimation of query result sizes
AIKED'05 Proceedings of the 4th WSEAS International Conference on Artificial Intelligence, Knowledge Engineering Data Bases
Generating targeted queries for database testing
Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Distinct value estimation on peer-to-peer networks
Proceedings of the 1st international conference on PErvasive Technologies Related to Assistive Environments
Feature-preserved sampling over streaming data
ACM Transactions on Knowledge Discovery from Data (TKDD)
The design of a query monitoring system
ACM Transactions on Database Systems (TODS)
Kernel-based skyline cardinality estimation
Proceedings of the 2009 ACM SIGMOD International Conference on Management of data
An experimental study of time-constrained aggregate queries
Proceedings of the 13th International Conference on Extending Database Technology
Hierarchically organized skew-tolerant histograms for geographic data objects
Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
The VC-dimension of SQL queries and selectivity estimation through sampling
ECML PKDD'11 Proceedings of the 2011 European conference on Machine learning and knowledge discovery in databases - Volume Part II
ROS: run-time optimization of SPARQL queries
WISM'11 Proceedings of the 2011 international conference on Web information systems and mining - Volume Part II
Efficient construction of histograms for multidimensional data using quad-trees
Decision Support Systems
HASE: a hybrid approach to selectivity estimation for conjunctive predicates
EDBT'06 Proceedings of the 10th international conference on Advances in Database Technology
A metropolis sampling method for drawing representative samples from large databases
DASFAA'05 Proceedings of the 10th international conference on Database Systems for Advanced Applications
SSTD'05 Proceedings of the 9th international conference on Advances in Spatial and Temporal Databases
Improving the accuracy of histograms for geographic data objects
DASFAA'12 Proceedings of the 17th international conference on Database Systems for Advanced Applications - Volume Part I
Automated physical designers: what you see is (not) what you get
DBTest '12 Proceedings of the Fifth International Workshop on Testing Database Systems
Synopses for Massive Data: Samples, Histograms, Wavelets, Sketches
Foundations and Trends in Databases
CS2: a new database synopsis for query estimation
Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data
Data & Knowledge Engineering
Hi-index | 0.00 |
We provide a procedure, based on random sampling, for estimation of the size of a query result. The procedure is sequential in that sampling terminates after a random number of steps according to a stopping rule that depends upon the observations obtained so far. Enough observations are obtained so that, with a pre-specified probability, the estimate differs from the true size of the query result by no more than a prespecified amount. Unlike previous sequential estimation procedures for queries, our procedure is asymptotically efficient and requires no ad hoc pilot sample or a a priori assumptions about data characteristics. In addition to establishing the asymptotic properties of the estimation procedure, we provide techniques for reducing undercoverage at small sample sizes and show that the sampling cost of the procedure can be reduced through stratified sampling techniques.