Bichromatic buckets: An effective technique to improve the accuracy of histograms for geographic data points

  • Authors:
  • Hai Thanh Mai; Jaeho Kim; Myoung Ho Kim

  • Venue:
  • Data & Knowledge Engineering
  • Year:
  • 2013

Abstract

Histograms are widely used for estimating selectivity in query optimization. In this paper, we propose a new technique to improve the accuracy of histograms for two-dimensional geographic data points, which are used in many real-world Geographic Information Systems. Typically, a histogram consists of a collection of rectangular regions, called buckets. The main idea of our technique is to use a straight line to convert each rectangular bucket into a new one with two separate regions. The converted buckets, called bichromatic buckets, approximate the distribution of data objects better while preserving the simplicity of the original rectangular ones. To construct bichromatic buckets, we propose an adaptive algorithm for finding good separating lines. The algorithm flexibly combines two strategies for finding the separating lines: one based on the potential skewness gains of the candidate lines, and the other based on the difference in density levels of the data regions. We then describe how to apply the proposed technique to existing histogram construction methods to further improve the accuracy of the constructed histograms. Results from extensive experiments using real-life data sets demonstrate that our technique improves the accuracy of the histograms by a factor of two on average.
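To make the idea of a bichromatic bucket concrete, the following is a minimal sketch in Python of how such a bucket could answer a range-count query. It is an illustration only, not the paper's implementation: the class `BichromaticBucket`, its fields, the line representation `a*x + b*y + c = 0`, and the assumption that points are uniformly distributed within each of the two regions are all hypothetical choices made here. The search for a good separating line (the skewness-gain and density-difference strategies) is not shown, since the abstract does not specify it.

```python
from dataclasses import dataclass


def polygon_area(poly):
    """Absolute area of a simple polygon (shoelace formula)."""
    s = 0.0
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        s += x1 * y2 - x2 * y1
    return abs(s) / 2.0


def clip_half_plane(poly, a, b, c):
    """Part of a convex polygon where a*x + b*y + c >= 0."""
    out = []
    n = len(poly)
    for i in range(n):
        p, q = poly[i], poly[(i + 1) % n]
        fp = a * p[0] + b * p[1] + c
        fq = a * q[0] + b * q[1] + c
        if fp >= 0:
            out.append(p)
        if (fp > 0 and fq < 0) or (fp < 0 and fq > 0):
            # The edge p->q strictly crosses the line: add the crossing point.
            t = fp / (fp - fq)
            out.append((p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])))
    return out


def rect_polygon(x1, y1, x2, y2):
    return [(x1, y1), (x2, y1), (x2, y2), (x1, y2)]


@dataclass
class BichromaticBucket:
    # Hypothetical layout: bounding rectangle, separating line, two counts.
    x1: float
    y1: float
    x2: float
    y2: float
    a: float  # separating line: a*x + b*y + c = 0
    b: float
    c: float
    count_pos: int  # points with a*x + b*y + c >= 0
    count_neg: int  # points with a*x + b*y + c < 0

    def estimate(self, qx1, qy1, qx2, qy2):
        """Estimated number of points inside the query rectangle,
        assuming uniformity within each of the two regions."""
        ix1, iy1 = max(self.x1, qx1), max(self.y1, qy1)
        ix2, iy2 = min(self.x2, qx2), min(self.y2, qy2)
        if ix1 >= ix2 or iy1 >= iy2:
            return 0.0  # query does not overlap the bucket
        bucket = rect_polygon(self.x1, self.y1, self.x2, self.y2)
        overlap = rect_polygon(ix1, iy1, ix2, iy2)
        total = 0.0
        for sign, count in ((1, self.count_pos), (-1, self.count_neg)):
            a, b, c = sign * self.a, sign * self.b, sign * self.c
            region_area = polygon_area(clip_half_plane(bucket, a, b, c))
            if region_area > 0:
                hit_area = polygon_area(clip_half_plane(overlap, a, b, c))
                total += count * hit_area / region_area
        return total


# A 10x10 bucket split by the vertical line x = 5:
# 90 points lie on the right side, 10 on the left.
bucket = BichromaticBucket(0, 0, 10, 10, a=1, b=0, c=-5,
                           count_pos=90, count_neg=10)
left_half = bucket.estimate(0, 0, 5, 10)
print(left_half)
```

For the left half of this bucket, a plain rectangular bucket holding the same 100 points would estimate 50 points under the uniformity assumption, whereas the split bucket attributes only the 10 left-side points to the query. This is the sense in which a single extra line lets a bucket follow a skewed distribution more closely while keeping the structure simple.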