Binary-Tree Histograms with Tree Indices

Authors:
Francesco Buccafurri;Filippo Furfaro;Gianluca Lax;Domenico Saccà
Venue:
DEXA '02 Proceedings of the 13th International Conference on Database and Expert Systems Applications
Year:
2002


Abstract

In many application contexts, such as statistical databases, transaction recording systems, scientific databases, query optimizers, and OLAP, data are summarized as histograms of aggregate values. When range queries on the original data are answered from the aggregate data alone, some estimation error is unavoidable because compression loses information. The size of the error depends strongly both on how the histogram partitions the data domain and on how values are estimated inside each bucket. We propose a new type of histogram, based on an unbalanced binary-tree partition, that is suited to answering hierarchical range queries quickly, and we use adaptive tree indexing to approximate the frequencies inside buckets more accurately. Our experiments show that the proposed histogram performs considerably better than state-of-the-art histograms, with smaller errors on all the data sets considered at the same storage space.
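
To make the source of the estimation error concrete, the following is a minimal sketch of range-query estimation on a plain equi-width histogram, assuming frequencies are spread uniformly inside each bucket. It is not the binary-tree histogram proposed in the paper; the function names `build_equiwidth` and `estimate_range_sum` and the sample data are hypothetical and serve only as illustration.

```python
def build_equiwidth(freqs, num_buckets):
    """Summarize a frequency vector as (lo, hi, total) buckets; hi is exclusive."""
    n = len(freqs)
    buckets = []
    for b in range(num_buckets):
        lo = b * n // num_buckets
        hi = (b + 1) * n // num_buckets
        buckets.append((lo, hi, sum(freqs[lo:hi])))
    return buckets


def estimate_range_sum(buckets, a, b):
    """Estimate sum(freqs[a:b]) from bucket totals alone.

    Assumption (illustrative only): each bucket's total is spread
    uniformly over the positions it covers.
    """
    estimate = 0.0
    for lo, hi, total in buckets:
        overlap = max(0, min(b, hi) - max(a, lo))
        if overlap > 0:
            estimate += total * overlap / (hi - lo)
    return estimate


# Hypothetical data: the first bucket is skewed, the second is uniform.
freqs = [10, 0, 0, 2, 5, 5, 5, 5]
buckets = build_equiwidth(freqs, 2)

exact_skewed = sum(freqs[0:2])                       # true answer on raw data
approx_skewed = estimate_range_sum(buckets, 0, 2)    # answer from the summary
exact_uniform = sum(freqs[4:6])
approx_uniform = estimate_range_sum(buckets, 4, 6)
```

In this sketch the query over the skewed bucket is underestimated, while the query over the uniform bucket is recovered exactly. This illustrates why both the placement of bucket boundaries and the estimation method used inside a bucket affect the error.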