A sample advisor for approximate query processing
ADBIS'10 Proceedings of the 14th east European conference on Advances in databases and information systems
The rapid growth of data volumes makes sampling a crucial component of modern data management systems. Although there is a large body of work on database sampling, the problem of automatically determining the optimal sample for a given query has remained almost unaddressed. To tackle this problem, the authors propose a sample advisor based on a novel cost model. The advisor is primarily designed to recommend samples for a few queries specified by an expert; the authors additionally propose two extensions. The first extension broadens applicability by using recorded workload information and taking memory bounds into account. The second extension increases effectiveness by merging samples when pieces of sample advice overlap. For both extensions, the authors present exact and heuristic solutions. In their evaluation, the authors analyze the properties of the cost model and demonstrate the effectiveness and efficiency of the heuristic solutions in a variety of experiments.
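To make the idea of choosing samples under a memory bound concrete, the following is a minimal sketch of one generic heuristic: a greedy, knapsack-style selection by benefit per unit of memory. It is not the authors' algorithm or cost model, which the abstract does not specify. The `Candidate` class, its `size` and `benefit` fields, and the `greedy_select` function are all hypothetical names introduced for illustration, and the benefit values are assumed to come from some external cost estimate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A hypothetical candidate sample proposed for the workload."""
    name: str
    size: int        # memory footprint in arbitrary units; assumed > 0
    benefit: float   # assumed estimate of workload cost saved by this sample


def greedy_select(candidates, memory_bound):
    """Pick candidates in descending benefit-per-size order while they fit.

    This is a standard greedy knapsack heuristic, used here only as an
    illustration; it does not guarantee an optimal selection.
    """
    chosen = []
    used = 0
    ranked = sorted(candidates, key=lambda c: c.benefit / c.size, reverse=True)
    for cand in ranked:
        if used + cand.size <= memory_bound:
            chosen.append(cand)
            used += cand.size
    return chosen


if __name__ == "__main__":
    pool = [
        Candidate("a", size=4, benefit=8.0),
        Candidate("b", size=3, benefit=3.0),
        Candidate("c", size=5, benefit=7.5),
    ]
    for cand in greedy_select(pool, memory_bound=9):
        print(cand.name, cand.size, cand.benefit)
```

An exact solution would instead search all subsets that fit the bound (for example by dynamic programming over sizes); the greedy variant trades optimality for speed, which is the kind of trade-off the exact-versus-heuristic distinction in the abstract refers to.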