A sample advisor for approximate query processing

Authors:
Philipp Rösch;Wolfgang Lehner
Affiliations:
SAP Research Center Dresden, Germany;Database Technology Group, Technische Universität Dresden, Germany
Venue:
ADBIS'10: Proceedings of the 14th East European Conference on Advances in Databases and Information Systems
Year:
2010

Abstract

The rapid growth of current data warehouse systems makes random sampling a crucial component of modern data management systems. Although there is a large body of work on database sampling, the problem of automatic sample selection has remained largely unaddressed. In this paper, we tackle this problem with a sample advisor. We propose a cost model that evaluates a sample for a given query. Based on this cost model, our sample advisor determines the optimal set of samples for a set of queries specified by an expert. We further propose an extension that utilizes recorded workload information. In this case, the sample advisor takes both the set of queries and a given memory bound into account when computing its sample advice. Additionally, we consider merging samples when the advised samples overlap, and we present both an exact and a heuristic solution. In our evaluation, we analyze the properties of the cost model and compare the proposed algorithms. We further demonstrate the effectiveness and efficiency of the heuristic solutions in a variety of experiments.
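
The abstract does not spell out the cost model or the selection algorithms, so the following is only a minimal sketch of the general idea of choosing samples under a memory bound, not the authors' method. It assumes that each candidate sample already has a known memory footprint and a precomputed benefit score; the `Candidate` class, the `size` and `benefit` fields, and both functions are hypothetical names introduced for illustration. The sketch contrasts a greedy heuristic with an exhaustive exact search.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Candidate:
    """A hypothetical candidate sample (not a structure from the paper)."""
    name: str
    size: int        # assumed memory footprint, must be > 0
    benefit: float   # assumed benefit from some cost model, higher is better


def greedy_advice(candidates, memory_bound):
    """Heuristic: take candidates in order of benefit per unit of memory."""
    chosen, used = [], 0
    ranked = sorted(candidates, key=lambda c: c.benefit / c.size, reverse=True)
    for cand in ranked:
        if used + cand.size <= memory_bound:
            chosen.append(cand)
            used += cand.size
    return chosen


def exact_advice(candidates, memory_bound):
    """Exact: enumerate every subset (exponential, small inputs only)."""
    best, best_benefit = (), 0.0
    for r in range(1, len(candidates) + 1):
        for subset in combinations(candidates, r):
            size = sum(c.size for c in subset)
            benefit = sum(c.benefit for c in subset)
            if size <= memory_bound and benefit > best_benefit:
                best, best_benefit = subset, benefit
    return list(best)


if __name__ == "__main__":
    cands = [
        Candidate("A", 6, 9.0),
        Candidate("B", 5, 7.0),
        Candidate("C", 5, 7.0),
    ]
    print([c.name for c in greedy_advice(cands, 10)])
    print([c.name for c in exact_advice(cands, 10)])
```

In this toy input the greedy heuristic commits to the candidate with the best benefit-to-size ratio and then cannot fit anything else, while the exhaustive search finds a pair with a higher total benefit. This illustrates why an exact solution and a cheaper heuristic can give different advice, under the assumptions stated above.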