Sharing-aware horizontal partitioning for exploiting correlations during query processing

  • Authors:
  • Kostas Tzoumas;Amol Deshpande;Christian S. Jensen

  • Affiliations:
  • Aalborg University, Denmark;University of Maryland, College Park, MD;Aarhus University, Denmark

  • Venue:
  • Proceedings of the VLDB Endowment
  • Year:
  • 2010

Quantified Score

Hi-index 0.00

Visualization

Abstract

Optimization of join queries based on average selectivities is suboptimal in highly correlated databases. In such databases, relations are naturally divided into partitions, each partition having substantially different statistical characteristics. It is very compelling to discover such data partitions during query optimization and create multiple plans for a given query, one plan being optimal for a particular combination of data partitions. This scenario calls for the sharing of state among plans, so that common intermediate results are not recomputed. We study this problem in a setting with a routing-based query execution engine based on eddies [1]. Eddies naturally encapsulate horizontal partitioning and maximal state sharing across multiple plans. We define the notion of a conditional join plan, a novel representation of the search space that enables us to address the problem in a principled way. We present a low-overhead greedy algorithm that uses statistical summaries based on graphical models. Experimental results suggest an order of magnitude faster execution time over traditional optimization for high correlations, while maintaining the same performance for low correlations.