Optimizing Sample Design for Approximate Query Processing

  • Authors:
  • Philipp Rösch; Wolfgang Lehner

  • Affiliations:
  • Business Intelligence Practice, SAP Research, Dresden, Germany; Database Technology Research Group, Dresden University of Technology, Dresden, Germany

  • Venue:
  • International Journal of Knowledge-Based Organizations
  • Year:
  • 2013

Abstract

The rapid increase of data volumes makes sampling a crucial component of modern data management systems. Although there is a large body of work on database sampling, the problem of automatically determining the optimal sample for a given query has remained almost unaddressed. To tackle this problem, the authors propose a sample advisor based on a novel cost model. The advisor is primarily designed for advising samples for a few queries specified by an expert; the authors additionally propose two extensions. The first extension enhances the applicability by utilizing recorded workload information and taking memory bounds into account. The second extension increases the effectiveness by merging samples in case of overlapping pieces of sample advice. For both extensions, the authors present exact and heuristic solutions. In their evaluation, the authors analyze the properties of the cost model and demonstrate the effectiveness and the efficiency of the heuristic solutions with a variety of experiments.