Processing aggregate relational queries with hard time constraints
SIGMOD '89 Proceedings of the 1989 ACM SIGMOD international conference on Management of data
Estimating the size of generalized transitive closures
VLDB '89 Proceedings of the 15th international conference on Very large data bases
VLDB '89 Proceedings of the 15th international conference on Very large data bases
Practical selectivity estimation through adaptive sampling
SIGMOD '90 Proceedings of the 1990 ACM SIGMOD international conference on Management of data
Random sampling from hash files
SIGMOD '90 Proceedings of the 1990 ACM SIGMOD international conference on Management of data
Error-constrained COUNT query evaluation in relational databases
SIGMOD '91 Proceedings of the 1991 ACM SIGMOD international conference on Management of data
Sequential sampling procedures for query size estimation
SIGMOD '92 Proceedings of the 1992 ACM SIGMOD international conference on Management of data
Fixed-precision estimation of join selectivity
PODS '93 Proceedings of the twelfth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems
Probabilistic methods in query processing
Probabilistic methods in query processing
Query size estimation by adaptive sampling (extended abstract)
PODS '90 Proceedings of the ninth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems
Statistical estimators for relational algebra expressions
Proceedings of the seventh ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems
Simple Random Sampling from Relational Databases
VLDB '86 Proceedings of the 12th International Conference on Very Large Data Bases
Practical Skew Handling in Parallel Joins
VLDB '92 Proceedings of the 18th International Conference on Very Large Data Bases
Bifocal sampling for skew-resistant join size estimation
SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data
Perspectives on database theory
ACM SIGACT News
SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
Join synopses for approximate query answering
SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
Ripple joins for online aggregation
SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
Approximate Query Answering Using Data Warehouse Striping
Journal of Intelligent Information Systems - Special issue on data warehousing and knowledge discovery
Query sampling in DB2 Universal Database
SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
Sampling search-engine results
WWW '05 Proceedings of the 14th international conference on World Wide Web
Random Sampling for Continuous Streams with Arbitrary Updates
IEEE Transactions on Knowledge and Data Engineering
Physical Database Design: the database professional's guide to exploiting indexes, views, storage, and more
Adaptive-sampling algorithms for answering aggregation queries on Web sites
Data & Knowledge Engineering
Caching and incrementalisation in the java query language
Proceedings of the 23rd ACM SIGPLAN conference on Object-oriented programming systems languages and applications
The VC-dimension of SQL queries and selectivity estimation through sampling
ECML PKDD'11 Proceedings of the 2011 European conference on Machine learning and knowledge discovery in databases - Volume Part II
Efficient object querying for java
ECOOP'06 Proceedings of the 20th European conference on Object-Oriented Programming
A new framework for join product skew
RED'10 Proceedings of the Third international conference on Resource Discovery
Synopses for Massive Data: Samples, Histograms, Wavelets, Sketches
Foundations and Trends in Databases
Hi-index | 0.00 |
We compare the cost of estimating the selectivity of a “star join” using sampling procedure t-cross to the cost of simply computing the join and obtaining the exact answer. Our bounds and approximations for the relative cost of sampling show how this cost depends on the size of the input relations, the number of input relations, and the precision criterion used by the estimation procedure. We also demonstrate the deleterious effect of dangling tuples and the mixed effect of data skew on the relative cost of sampling. These results provide insight into when sampling should or should not be used for join selectivity estimation.