A new histogram method for sparse attributes: the averaged rectangular attribute cardinality map

  • Authors:
  • B. John Oommen; Jing Chen

  • Affiliations:
  • Carleton University, Ottawa, Canada; Carleton University, Ottawa, Canada

  • Venue:
  • ISICT '03 Proceedings of the 1st international symposium on Information and communication technologies
  • Year:
  • 2003

Abstract

Most current Database Management Systems (DBMSs) use histograms for query optimization and for approximating query result sizes, because histograms can be used to determine efficient query evaluation plans. All existing methods perform poorly when the attributes of a relation are very sparsely distributed, the so-called "sparse data cases". These are the worst-case scenarios for attributes with skewed distributions. In this paper, we propose a novel histogram-based algorithm, the Averaged Rectangular Attribute Cardinality Map (Averaged R-ACM), and demonstrate its performance in estimating query result sizes for the sparse data cases. The proposed algorithm combines the advantages of the traditional, widely used Equi-width histogram and a relatively new algorithm, the R-ACM2 introduced in [Thi99]. It thereby achieves two goals: compacting the sparse data distribution and obtaining accurate estimates of query result sizes. Its superiority is validated by an extensive set of experiments. The entire set of experimental results, obtained by integrating the proposed algorithm and other histogram-based algorithms into the ORACLE query optimizer, can be found in [Che03].
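
To make the sparse-data problem concrete, the following is a minimal sketch of a plain Equi-width histogram used for range-query size estimation. It is not the Averaged R-ACM or the R-ACM, whose details the abstract does not give. The function names, the uniform-within-bucket assumption, and the toy data are all illustrative assumptions.

```python
def build_equi_width(values, lo, hi, num_buckets):
    """Count the values falling in each of num_buckets equal-width buckets over [lo, hi]."""
    width = (hi - lo) / num_buckets
    counts = [0] * num_buckets
    for v in values:
        # Clamp so that v == hi lands in the last bucket.
        idx = min(int((v - lo) / width), num_buckets - 1)
        counts[idx] += 1
    return counts, width


def estimate_range(counts, lo, width, a, b):
    """Estimate how many values satisfy a <= value < b.

    Assumes (hypothetically) that values are spread uniformly inside each bucket.
    """
    total = 0.0
    for i, count in enumerate(counts):
        bucket_lo = lo + i * width
        bucket_hi = bucket_lo + width
        overlap = max(0.0, min(b, bucket_hi) - max(a, bucket_lo))
        total += count * overlap / width
    return total


# Toy sparse attribute: values cluster at the two ends of the domain [0, 100].
values = [1, 1, 1, 2, 97, 98]
counts, width = build_equi_width(values, lo=0, hi=100, num_buckets=4)
# counts is [4, 0, 0, 2]

estimate = estimate_range(counts, 0, width, 10, 25)  # estimated size of 10 <= value < 25
actual = sum(1 for v in values if 10 <= v < 25)      # true size of the same query
```

In this toy case the histogram estimates about 2.4 matching tuples for a range that actually contains none, because the four values near 1 are smeared across the whole first bucket. This is the kind of error that sparse distributions produce under a uniformity assumption.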