CURE for cubes: cubing using a ROLAP engine

Authors:
Konstantinos Morfonios;Yannis Ioannidis
Affiliations:
Dept. of Informatics and Telecom. Univ. of Athens;Dept. of Informatics and Telecom. Univ. of Athens
Venue:
VLDB '06 Proceedings of the 32nd international conference on Very large data bases
Year:
2006

Abstract

Data cube construction has been the focus of much research due to its importance in improving the efficiency of OLAP. A significant fraction of this work has been on ROLAP techniques, which are based on relational technology. Existing ROLAP cubing solutions mainly focus on "flat" datasets, which do not include hierarchies in their dimensions. Hierarchies, however, introduce several complications into cube construction, making existing techniques essentially inapplicable in a significant number of real-world applications. In particular, hierarchies raise three main challenges:

(a) The number of nodes in the cube lattice increases dramatically and its shape becomes more involved. This requires new forms of lattice traversal for efficient execution.

(b) The number of unique values in the higher levels of a dimension hierarchy may be very small; hence, partitioning data into fragments that fit in memory and include all entries of a particular value may often be impossible. This requires new partitioning schemes.

(c) The number of tuples that need to be materialized in the final cube increases dramatically. This requires new storage schemes that remove all forms of redundancy for efficient space utilization.

In this paper, we propose CURE, a novel ROLAP cubing method that addresses these issues and constructs complete data cubes over very large datasets with arbitrary hierarchies. CURE contributes a novel lattice traversal scheme, an optimized partitioning method, and a suite of relational storage schemes that address all forms of redundancy. We demonstrate the effectiveness of CURE through experiments on both real-world and synthetic datasets. Most notably, CURE is the first ROLAP technique to complete the construction of the cube of the highest-density dataset in the APB-1 benchmark (12 GB). CURE was in fact quite efficient on this dataset, showing great promise with respect to the potential of the technique overall.
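
The growth of the cube lattice under hierarchies, the first challenge named in the abstract, can be made concrete with a small count. The sketch below is not CURE's traversal scheme. It only enumerates lattice nodes under the simplifying assumption that every dimension has a linear hierarchy (a single chain of levels), in which case each dimension contributes its levels plus a special ALL level. The dimension and level names are hypothetical, chosen for illustration.

```python
from itertools import product
from math import prod


def lattice_nodes(levels_per_dim):
    """Enumerate cube lattice nodes, one grouping level per dimension.

    levels_per_dim maps a dimension name to its hierarchy levels, ordered
    from finest to coarsest. Assumption: each hierarchy is a simple chain,
    so a dimension offers len(levels) + 1 choices (its levels plus "ALL").
    """
    choices = [list(levels) + ["ALL"] for levels in levels_per_dim.values()]
    return list(product(*choices))


def lattice_size(levels_per_dim):
    """Count lattice nodes without enumerating them."""
    return prod(len(levels) + 1 for levels in levels_per_dim.values())


# Hypothetical "flat" schema: one level per dimension, so 2^3 nodes.
flat = {
    "product": ["product"],
    "store": ["store"],
    "time": ["day"],
}

# Hypothetical schema with hierarchies on the same three dimensions.
hierarchical = {
    "product": ["product", "category"],
    "store": ["store", "city", "country"],
    "time": ["day", "month", "year"],
}

flat_count = lattice_size(flat)                  # 2 * 2 * 2
hierarchical_count = lattice_size(hierarchical)  # 3 * 4 * 4

print(f"flat lattice nodes:         {flat_count}")
print(f"hierarchical lattice nodes: {hierarchical_count}")
```

With these assumed schemas the flat lattice has 8 nodes and the hierarchical one has 48, even though the number of dimensions is unchanged. Hierarchies that are not simple chains would give a different count, so the product formula should be read as an illustration only.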