Revisiting the cube lifecycle in the presence of hierarchies

Authors:
Konstantinos Morfonios; Yannis Ioannidis
Affiliations:
IBM Almaden Research Center, San Jose, USA 95120; Department of Informatics and Telecommunications, University of Athens, Athens, Greece
Venue:
The VLDB Journal — The International Journal on Very Large Data Bases
Year:
2010

Abstract

On-line analytical processing (OLAP) typically involves complex aggregate queries over large datasets. The data cube has been proposed as a structure that materializes the results of such queries to accelerate OLAP. A significant fraction of the related work concerns Relational-OLAP (ROLAP) techniques, which are based on relational technology. Existing ROLAP cubing solutions mainly focus on "flat" datasets, whose dimensions do not include hierarchies. As shown in this paper, however, hierarchies complicate the entire lifecycle of a data cube, including construction, storage, indexing, query processing, and incremental maintenance. This renders existing techniques essentially inapplicable to a significant number of real-world applications and mandates revisiting the entire cube lifecycle from this new perspective. To overcome this problem, the CURE algorithm was recently proposed as an efficient mechanism for constructing complete cubes over large datasets with arbitrary hierarchies and storing them in a highly compressed format that is compatible with the relational model. In this paper, we study the remaining phases of the cube lifecycle and introduce query-processing and incremental-maintenance algorithms for CURE cubes. These differ significantly from earlier approaches, which were proposed for flat cubes constructed by other techniques and are inadequate for CURE because of its high compression rate and the presence of hierarchies. Our methods address issues such as cube indexing, query optimization, and lazy update policies. For updates in particular, this is the first time such lazy approaches are applied to cubes. We demonstrate the effectiveness of CURE in all phases of the cube lifecycle through experiments on both real-world and synthetic datasets. Among the experimental results, we highlight those that make CURE the first ROLAP technique to complete the construction and usage of the cube for the highest-density dataset in the APB-1 benchmark (12 GB). CURE was in fact quite efficient on this dataset, which shows the promise of the technique overall.
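
As a rough illustration of why hierarchies enlarge the problem, the sketch below naively materializes every group-by of a tiny fact table over two hierarchical dimensions. It is not CURE and does not reflect its storage format or algorithms; the hierarchies, the fact table, and every name in it (`HIERARCHIES`, `FACTS`, `build_cube`) are assumptions made up for this example. With two flat dimensions the cube would have 2^2 = 4 group-bys, whereas with the hierarchy levels assumed here it has 4 x 3 = 12.

```python
from collections import defaultdict
from itertools import product

# Hypothetical lookup used by the "country" level of the store hierarchy.
CITY_TO_COUNTRY = {"Athens": "GR", "Patras": "GR", "San Jose": "US"}

# Hypothetical hierarchies, ordered from most detailed to most general.
# Each level maps a base-level value to its value at that level;
# "ALL" collapses the dimension entirely.
HIERARCHIES = {
    "time": {
        "day": lambda d: d,
        "month": lambda d: d[:7],
        "year": lambda d: d[:4],
        "ALL": lambda d: "ALL",
    },
    "store": {
        "city": lambda c: c,
        "country": lambda c: CITY_TO_COUNTRY[c],
        "ALL": lambda c: "ALL",
    },
}

# Made-up fact table: (day, city, sales).
FACTS = [
    ("2009-01-05", "Athens", 10),
    ("2009-01-17", "Patras", 5),
    ("2009-02-02", "San Jose", 7),
    ("2010-03-09", "Athens", 3),
]


def build_cube(facts):
    """Return {(time_level, store_level): {(time_value, store_value): sum}}.

    Every combination of hierarchy levels is one node of the cube lattice,
    computed here by a full scan of the fact table (no sharing, no compression).
    """
    cube = {}
    for t_level, s_level in product(HIERARCHIES["time"], HIERARCHIES["store"]):
        t_map = HIERARCHIES["time"][t_level]
        s_map = HIERARCHIES["store"][s_level]
        node = defaultdict(int)
        for day, city, sales in facts:
            node[(t_map(day), s_map(city))] += sales
        cube[(t_level, s_level)] = dict(node)
    return cube


cube = build_cube(FACTS)
print(len(cube))                    # number of group-bys in the lattice
print(cube[("year", "country")])    # sales per year and country
```

Each extra hierarchy level multiplies the number of group-bys, so construction, storage, and maintenance all have more nodes to cover than in the flat case.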