A multi-dimensional histogram for selectivity estimation and fast approximate query answering

  • Authors:
  • Hai Wang; Kenneth C. Sevcik

  • Affiliations:
  • Department of Computer Science, University of Toronto (both authors)

  • Venue:
  • CASCON '03 Proceedings of the 2003 conference of the Centre for Advanced Studies on Collaborative research
  • Year:
  • 2003

Abstract

Histograms have been widely used for selectivity estimation in query optimization, as well as for fast approximate query answering in many data mining, OLAP, and data visualization applications. This paper presents a new type of multi-dimensional histogram, the multi-dimensional VI histogram. Unlike other types of multi-dimensional histograms, which are seldom used in practice because of their high construction costs, the multi-dimensional VI histogram can be constructed in a single scan of the data. Through a set of experiments, we show that the multi-dimensional VI histogram provides more accurate estimates than the techniques currently used in major commercial database management systems, including IBM DB2, Oracle Database, and Microsoft SQL Server.