Optimizing Selections over Datacubes

Authors:
Kenneth A. Ross;Kazi A. Zaman
Affiliations:
-;-
Venue:
SSDBM '00 Proceedings of the 12th International Conference on Scientific and Statistical Database Management
Year:
2000

Citing 14
Cited 4

Predicate migration: optimizing queries with expensive predicates

SIGMOD '93 Proceedings of the 1993 ACM SIGMOD international conference on Management of data
Optimizing disjunctive queries with expensive predicates

SIGMOD '94 Proceedings of the 1994 ACM SIGMOD international conference on Management of data
Implementing data cubes efficiently

SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data
An array-based algorithm for simultaneous multidimensional aggregates

SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
Foundations of aggregation constraints

Theoretical Computer Science
Bottom-up computation of sparse and Iceberg CUBE

SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Totals

Data Mining and Knowledge Discovery
Fast Computation of Sparse Datacubes

VLDB '97 Proceedings of the 23rd International Conference on Very Large Data Bases
Computing Iceberg Queries Efficiently

VLDB '98 Proceedings of the 24th International Conference on Very Large Data Bases
Fast Algorithms for Mining Association Rules in Large Databases

VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
Query Optimization by Predicate Move-Around

VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
Storage Estimation for Multidimensional Aggregates in the Presence of Hierarchies

VLDB '96 Proceedings of the 22nd International Conference on Very Large Data Bases
On the Computation of Multidimensional Aggregates

VLDB '96 Proceedings of the 22nd International Conference on Very Large Data Bases
Reasoning with Aggregation Constraints

EDBT '96 Proceedings of the 5th International Conference on Extending Database Technology: Advances in Database Technology

A retrieval technique for high-dimensional data and partially specified queries

Data & Knowledge Engineering
Processing OLAP queries in hierarchically clustered databases

Data & Knowledge Engineering - Special issue: Advances in OLAP
Processing partially specified queries over high-dimensional databases

Data & Knowledge Engineering
Answering ad hoc aggregate queries from data streams using prefix aggregate trees

Knowledge and Information Systems

Abstract

Datacube queries compute aggregates over database relations at a variety of granularities. Often one wants only those datacube output tuples whose aggregate value satisfies a certain condition, such as exceeding a given threshold. We develop algorithms for processing a datacube query that use the selection condition internally during the computation. Thus, we can safely prune parts of the computation and compute the answer more efficiently. Our first technique, called "specialization", uses the fact that a tuple in the datacube does not meet the given threshold to infer that no finer-level aggregate can meet the threshold. Our second technique, called "generalization", applies when the actual value of the aggregate is not needed in the output but is used only for comparison with the threshold. We demonstrate the efficiency of these techniques by implementing them within the sparse datacube algorithm of Ross and Srivastava. We present a performance study using synthetic and real-world data sets. Our results indicate substantial performance improvements for queries with selective conditions.
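The pruning idea behind specialization can be illustrated with a minimal sketch. This is not the authors' implementation and not the Ross and Srivastava algorithm. It is a simple recursive partitioning scheme, written under the assumption that the aggregate is COUNT and the condition is "count >= threshold", so that a group failing the threshold cannot contain a finer group that passes. The names `thresholded_cube`, `refine`, `rows`, and `dims` are hypothetical and chosen only for illustration.

```python
def thresholded_cube(rows, dims, threshold):
    """Return {group_key: count} for every group-by cell with count >= threshold.

    rows: list of dicts mapping dimension name -> value (illustrative layout).
    dims: ordered list of dimension names to group by.
    A group_key is a tuple of (dimension, value) pairs; () is the grand total.
    """
    result = {}

    def refine(part, fixed, start):
        # `part` holds the rows that agree on every (dimension, value) in `fixed`.
        if len(part) < threshold:
            # COUNT can only shrink as a group is refined, so every finer
            # cell below this one also fails the threshold: stop here.
            return
        result[tuple(fixed)] = len(part)
        # Refine by each remaining dimension, in order, to avoid duplicates.
        for i in range(start, len(dims)):
            buckets = {}
            for row in part:
                buckets.setdefault(row[dims[i]], []).append(row)
            for value, sub in buckets.items():
                refine(sub, fixed + [(dims[i], value)], i + 1)

    refine(rows, [], 0)
    return result


# Small illustrative input (made-up values, two dimensions).
rows = [
    {"a": 1, "b": 1},
    {"a": 1, "b": 2},
    {"a": 1, "b": 1},
    {"a": 2, "b": 1},
]
cube = thresholded_cube(rows, ["a", "b"], threshold=2)
```

In this example the cell a=2 has a count of 1, so it is discarded and none of its refinements are ever examined. The same argument would not hold for an aggregate that can grow under refinement, which is why the sketch is restricted to COUNT.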