Selectivity estimation by batch-query based histogram and parametric method

  • Authors:
  • Jizhou Luo;Xiaofang Zhou;Yu Zhang;Heng Tao Shen;Jianzhong Li

  • Affiliations:
  • Harbin Institute of Technology, China;University of Queensland, Australia;University of Queensland, Australia;University of Queensland, Australia;Harbin Institute of Technology, China

  • Venue:
  • ADC '07 Proceedings of the eighteenth conference on Australasian database - Volume 63
  • Year:
  • 2007

Abstract

Histograms are used extensively for selectivity estimation and approximate query processing. Workload-aware dynamic histograms can tune themselves based on query feedback, without scanning or sampling the underlying datasets in a systematic and comprehensive way. Dynamic histograms allocate more buckets not only to the areas with the most skewed data distribution but also according to users' interest. However, they take a long time to 'warm up' (i.e., a large number of queries need to be processed before the histogram can provide satisfactory coverage and accuracy). As a result, they adapt slowly to changes in the workload pattern. In this paper, we propose a novel online query scheduling algorithm which can significantly reduce the warm-up time for dynamic histograms. A parametric method is proposed to remedy the problem of inaccurate query selectivity estimation for the areas with poor histogram coverage. Experimental results demonstrate a significant improvement in the effectiveness and accuracy of our approach.
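
To make the query-feedback idea in the abstract concrete, the following is a minimal sketch of a one-dimensional histogram that refines its bucket frequencies from observed query results. It is a generic illustration only, not the scheduling algorithm or the parametric method proposed in the paper. The class name `FeedbackHistogram`, the fixed equi-width buckets, the uniform-within-bucket assumption, and the `damping` parameter are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Bucket:
    lo: float    # inclusive lower bound of the bucket
    hi: float    # exclusive upper bound of the bucket
    freq: float  # current estimate of the number of tuples in [lo, hi)


class FeedbackHistogram:
    """Hypothetical equi-width histogram refined only by query feedback."""

    def __init__(self, lo: float, hi: float, n_buckets: int, total_rows: float):
        width = (hi - lo) / n_buckets
        # Start from a uniform guess: no data scan, no sampling.
        self.buckets: List[Bucket] = [
            Bucket(lo + i * width, lo + (i + 1) * width, total_rows / n_buckets)
            for i in range(n_buckets)
        ]

    @staticmethod
    def _overlap(b: Bucket, q_lo: float, q_hi: float) -> float:
        """Fraction of bucket b covered by the range query [q_lo, q_hi)."""
        covered = min(b.hi, q_hi) - max(b.lo, q_lo)
        return max(0.0, covered) / (b.hi - b.lo)

    def estimate(self, q_lo: float, q_hi: float) -> float:
        """Estimated result size, assuming uniformity inside each bucket."""
        return sum(b.freq * self._overlap(b, q_lo, q_hi) for b in self.buckets)

    def refine(self, q_lo: float, q_hi: float, actual: float,
               damping: float = 0.5) -> None:
        """Spread the estimation error over the buckets the query touched."""
        fracs = [self._overlap(b, q_lo, q_hi) for b in self.buckets]
        contribs = [b.freq * f for b, f in zip(self.buckets, fracs)]
        est = sum(contribs)
        # Weight by each bucket's contribution to the estimate; if the
        # estimate is zero, fall back to weighting by overlap fraction.
        weights = contribs if est > 0 else fracs
        total = sum(weights)
        if total == 0:
            return  # the query lies entirely outside the histogram's domain
        error = actual - est
        for b, w in zip(self.buckets, weights):
            b.freq = max(0.0, b.freq + damping * error * w / total)


if __name__ == "__main__":
    hist = FeedbackHistogram(0.0, 100.0, n_buckets=4, total_rows=1000.0)
    print(hist.estimate(0.0, 50.0))      # estimate before any feedback
    hist.refine(0.0, 50.0, actual=900.0)  # feedback from one executed query
    print(hist.estimate(0.0, 50.0))      # estimate after refinement
```

Regions that no query has touched keep their initial uniform guess in this sketch, which illustrates the warm-up problem the abstract describes: accuracy improves only where feedback has accumulated.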