Aggregate Aware Caching for Multi-Dimensional Queries

Authors:
Prasad Deshpande;Jeffrey F. Naughton
Affiliations:
-;-
Venue:
EDBT '00 Proceedings of the 7th International Conference on Extending Database Technology: Advances in Database Technology
Year:
2000

Citing 12
Cited 1

Implementing data cubes efficiently

SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data
Dynamic assembly of views in data cubes

PODS '98 Proceedings of the seventeenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems
Caching multidimensional queries using chunks

SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
DynaMat: a dynamic view management system for data warehouses

SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
Complex Aggregation at Multiple Granularities

EDBT '98 Proceedings of the 6th International Conference on Extending Database Technology: Advances in Database Technology
Efficient Organization of Large Multidimensional Arrays

Proceedings of the Tenth International Conference on Data Engineering
Materialized View Selection for Multidimensional Datasets

VLDB '98 Proceedings of the 24th International Conference on Very Large Data Bases
Semantic Data Caching and Replacement

VLDB '96 Proceedings of the 22nd International Conference on Very Large Data Bases
On the Computation of Multidimensional Aggregates

VLDB '96 Proceedings of the 22nd International Conference on Very Large Data Bases
Answering Queries with Aggregation Using Views

VLDB '96 Proceedings of the 22nd International Conference on Very Large Data Bases
WATCHMAN: A Data Warehouse Intelligent Cache Manager

VLDB '96 Proceedings of the 22nd International Conference on Very Large Data Bases
Efficient database support for OLAP queries (on-line analytical processing)

High Performance Analytics with the R3-Cache

DaWaK '09 Proceedings of the 11th International Conference on Data Warehousing and Knowledge Discovery

Abstract

To date, work on caching for OLAP workloads has focussed on using cached results from a previous query as the answer to another query. This strategy is effective when the query stream exhibits a high degree of locality. Unfortunately, it misses the dramatic performance improvements obtainable when the answer to a query, while not immediately available in the cache, can be computed from data in the cache. In this paper, we consider the common subcase of answering queries by aggregating data in the cache. To use aggregation in the cache, one must solve two subproblems: (1) determining when it is possible to answer a query by aggregating data in the cache, and (2) determining the fastest path for this aggregation, since there can be many.

We present two strategies: a naive one and a virtual count based one. The virtual count based method determines almost instantaneously whether a query is computable from the cache, at the small overhead of maintaining a summary of the cache state. The algorithm also maintains cost-based information that can be used to choose the best option for computing a query result from the cache. Experiments with our implementation show that aggregation in the cache leads to substantial performance improvement. The virtual count based methods further improve on the naive approaches in both cache lookup and aggregation times.
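
The two subproblems named in the abstract can be illustrated with a small sketch. It is not the paper's algorithm: it assumes the cache holds whole group-by results rather than finer units, that the aggregate is SUM, and that cost is simply the number of cached rows to scan. The names `pick_source` and `answer_from_cache` and the sample data are hypothetical. The linear scan over the cache is closer in spirit to a naive lookup than to the virtual count based method, whose details the abstract does not give.

```python
from collections import defaultdict


def pick_source(query_dims, cache):
    """Subproblems (1) and (2): find cached group-bys that cover the
    query's dimensions and return the one with the fewest rows,
    or None if no cached group-by can answer the query."""
    wanted = frozenset(query_dims)
    candidates = [dims for dims in cache if wanted <= frozenset(dims)]
    if not candidates:
        return None
    # Assumed cost model: rows scanned in the chosen cached result.
    return min(candidates, key=lambda dims: len(cache[dims]))


def answer_from_cache(query_dims, cache):
    """Aggregate (SUM) the chosen cached group-by down to query_dims."""
    source = pick_source(query_dims, cache)
    if source is None:
        return None  # would have to go to the backend instead
    positions = [source.index(d) for d in query_dims]
    result = defaultdict(int)
    for key, total in cache[source].items():
        result[tuple(key[p] for p in positions)] += total
    return dict(result)


# Hypothetical cache: dimension names -> {dimension values: SUM(sales)}
cache = {
    ("product", "store", "day"): {
        ("pen", "s1", "mon"): 1,
        ("pen", "s1", "tue"): 2,
        ("pen", "s2", "mon"): 4,
        ("ink", "s1", "mon"): 5,
    },
    ("product", "store"): {
        ("pen", "s1"): 3,
        ("pen", "s2"): 4,
        ("ink", "s1"): 5,
    },
}

by_product = answer_from_cache(("product",), cache)
```

Here the query on `product` alone is not cached, but two cached results cover it, and the sketch picks the smaller one (three rows instead of four) before summing. A query on a dimension absent from every cached result returns `None`.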