Range Selectivity Estimation for Continuous Attributes

  • Authors:
  • Flip Korn;Theodore Johnson;H. V. Jagadish

  • Venue:
  • SSDBM '99 Proceedings of the 11th International Conference on Scientific and Statistical Database Management
  • Year:
  • 1999

Abstract

Many commercial database systems maintain histograms to efficiently estimate query selectivities as part of query optimization. Most work on histogram design is implicitly geared towards discrete or categorical attribute value domains. In this paper, we consider approaches that are better suited to the continuous-valued attributes commonly found in scientific and statistical databases. We propose two methods based on spline functions for estimating the selectivity of range queries over univariate and multivariate data. These methods are more accurate than histograms. As our experiments on both real and synthetic data sets demonstrate, the proposed methods achieve substantially lower estimation error (up to 5.5 times lower) than state-of-the-art histograms, at exactly the same storage space and with comparable CPU runtime overhead; moreover, the advantage of the proposed spline methods grows when they are applied to multivariate data.
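The abstract does not describe the two spline methods in detail, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes a linear (degree-1) spline fitted to the empirical cumulative distribution of a single attribute, with knots placed at evenly spaced ranks; the selectivity of a range query [lo, hi] is then estimated as F(hi) - F(lo). The function names and the knot placement rule are hypothetical choices made for this example.

```python
import bisect


def build_linear_spline_cdf(values, num_knots):
    """Summarize `values` as knots of a piecewise-linear CDF approximation.

    Assumption of this sketch: knots sit at evenly spaced ranks of the
    sorted data, and num_knots >= 2.
    """
    data = sorted(values)
    n = len(data)
    knots, cdf = [], []
    for i in range(num_knots):
        rank = round(i * (n - 1) / (num_knots - 1))
        x = data[rank]
        if knots and x == knots[-1]:
            continue  # skip duplicate knot positions
        knots.append(x)
        # Fraction of values <= x (empirical CDF at the knot).
        cdf.append(bisect.bisect_right(data, x) / n)
    return knots, cdf


def cdf_estimate(knots, cdf, x):
    """Evaluate the piecewise-linear CDF approximation at x."""
    if x < knots[0]:
        return 0.0
    if x >= knots[-1]:
        return 1.0
    j = bisect.bisect_right(knots, x)  # knots[j-1] <= x < knots[j]
    x0, x1 = knots[j - 1], knots[j]
    t = (x - x0) / (x1 - x0)
    return cdf[j - 1] + t * (cdf[j] - cdf[j - 1])


def range_selectivity(knots, cdf, lo, hi):
    """Estimated fraction of rows with lo <= value <= hi."""
    return max(0.0, cdf_estimate(knots, cdf, hi) - cdf_estimate(knots, cdf, lo))


if __name__ == "__main__":
    # Hypothetical data: 1000 evenly spread values in [0, 1).
    values = [i / 1000 for i in range(1000)]
    knots, cdf = build_linear_spline_cdf(values, num_knots=5)
    print(range_selectivity(knots, cdf, 0.2, 0.6))
```

The stored summary is just the knot positions and the CDF values at those knots, so its size is comparable to a histogram with the same number of bucket boundaries. Unlike a histogram, which assumes a uniform spread inside each bucket, a higher-degree spline could also model curvature between knots; the linear case is used here only to keep the sketch short.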