Fast algorithms for hierarchical range histogram construction

  • Authors:
  • Sudipto Guha;Nick Koudas;Divesh Srivastava

  • Affiliations:
  • UPenn;AT&T Labs-Research;AT&T Labs-Research

  • Venue:
  • Proceedings of the twenty-first ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
  • Year:
  • 2002

Abstract

Data Warehousing and OLAP applications typically view data as having multiple logical dimensions (e.g., product, location) with natural hierarchies defined on each dimension. OLAP queries usually involve hierarchical selections on some of the dimensions, and often aggregate measure attributes (e.g., sales, volume). Accurately estimating the distribution of measure attributes, under hierarchical selections, is important in a variety of scenarios, including approximate query evaluation and cost-based optimization of queries. In this paper, we propose fast (near linear time) algorithms for the problem of approximating the distribution of measure attributes with hierarchies defined on them, using histograms. Our algorithms are based on dynamic programming and a novel notion of sparse intervals that we introduce, and are the first practical algorithms for this problem. They effectively trade space for construction time without compromising histogram accuracy. We complement our analytical contributions with an experimental evaluation using real data sets, demonstrating the superiority of our approach.
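
For readers unfamiliar with dynamic-programming histogram construction, the following is a minimal sketch of the classic V-optimal formulation: it partitions a sequence into a fixed number of contiguous buckets so that the total squared error around each bucket mean is minimized. It is not the algorithm proposed in the paper. It ignores hierarchies and sparse intervals entirely and runs in time quadratic in the input length rather than near linear. The function name `v_optimal_histogram` and its parameters are hypothetical, chosen only for illustration.

```python
def v_optimal_histogram(values, num_buckets):
    """Classic V-optimal histogram via dynamic programming (illustrative only).

    Returns (total squared error, list of half-open (start, end) index ranges).
    Assumes 1 <= num_buckets <= len(values).
    """
    n = len(values)
    if not 1 <= num_buckets <= n:
        raise ValueError("num_buckets must be between 1 and len(values)")

    # Prefix sums of values and squared values give O(1) bucket error lookups.
    prefix = [0.0] * (n + 1)
    prefix_sq = [0.0] * (n + 1)
    for k, v in enumerate(values):
        prefix[k + 1] = prefix[k] + v
        prefix_sq[k + 1] = prefix_sq[k] + v * v

    def bucket_error(i, j):
        # Sum of squared deviations from the mean for values[i:j], with j > i.
        total = prefix[j] - prefix[i]
        return (prefix_sq[j] - prefix_sq[i]) - total * total / (j - i)

    inf = float("inf")
    # best[b][j]: minimum error covering the first j values with b buckets.
    best = [[inf] * (n + 1) for _ in range(num_buckets + 1)]
    # cut[b][j]: start index of the last bucket in that optimal solution.
    cut = [[0] * (n + 1) for _ in range(num_buckets + 1)]
    best[0][0] = 0.0

    for b in range(1, num_buckets + 1):
        for j in range(b, n + 1):
            for i in range(b - 1, j):
                candidate = best[b - 1][i] + bucket_error(i, j)
                if candidate < best[b][j]:
                    best[b][j] = candidate
                    cut[b][j] = i

    # Walk the recorded cuts backwards to recover the bucket boundaries.
    buckets = []
    j = n
    for b in range(num_buckets, 0, -1):
        i = cut[b][j]
        buckets.append((i, j))
        j = i
    buckets.reverse()
    return best[num_buckets][n], buckets
```

For example, calling `v_optimal_histogram([1, 1, 1, 5, 5, 5], 2)` splits the sequence between the third and fourth values. The triple loop is what makes this baseline expensive on large inputs, which is the kind of construction cost the abstract says the proposed algorithms reduce.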