Range CUBE: Efficient Cube Computation by Exploiting Data Correlation

Authors:
Ying Feng;Divyakant Agrawal;Amr El Abbadi;Ahmed Metwally
Venue:
ICDE '04 Proceedings of the 20th International Conference on Data Engineering
Year:
2004

Citing 15
Cited 15

An array-based algorithm for simultaneous multidimensional aggregates

SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
Quasi-cubes: exploiting approximations in multidimensional databases

ACM SIGMOD Record
Bottom-up computation of sparse and Iceberg CUBE

SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
Compressed data cubes for OLAP aggregate query approximation on continuous dimensions

KDD '99 Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining
Efficient computation of Iceberg cubes with complex measures

SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
Dwarf: shrinking the PetaCube

Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Total

ICDE '96 Proceedings of the Twelfth International Conference on Data Engineering
Fast Computation of Sparse Datacubes

VLDB '97 Proceedings of the 23rd International Conference on Very Large Data Bases
Computing Iceberg Queries Efficiently

VLDB '98 Proceedings of the 24th International Conference on Very Large Data Bases
Using Loglinear Models to Compress Datacube

WAIM '00 Proceedings of the First International Conference on Web-Age Information Management
Fast Algorithms for Mining Association Rules in Large Databases

VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
QC-trees: an efficient summary structure for semantic OLAP

Proceedings of the 2003 ACM SIGMOD international conference on Management of data
Condensed Cube: An Efficient Approach to Reducing Data Cube Size

ICDE '02 Proceedings of the 18th International Conference on Data Engineering
Quotient cube: how to summarize the semantics of a data cube

VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases
Star-cubing: computing iceberg cubes by top-down and bottom-up integration

VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29

Efficient computation of the skyline cube

VLDB '05 Proceedings of the 31st international conference on Very large data bases
Communication and Memory Optimal Parallel Data Cube Construction

IEEE Transactions on Parallel and Distributed Systems
CURE for cubes: cubing using a ROLAP engine

VLDB '06 Proceedings of the 32nd international conference on Very large data bases
Efficient incremental maintenance of data cubes

VLDB '06 Proceedings of the 32nd international conference on Very large data bases
Towards multidimensional subspace skyline analysis

ACM Transactions on Database Systems (TODS)
ROLAP implementations of the data cube

ACM Computing Surveys (CSUR)
Why go logarithmic if we can go linear?: Towards effective distinct counting of search traffic

EDBT '08 Proceedings of the 11th international conference on Extending database technology: Advances in database technology
Supporting the data cube lifecycle: the power of ROLAP

The VLDB Journal — The International Journal on Very Large Data Bases
Answering aggregate keyword queries on relational databases using minimal group-bys

Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology
A Multiple Correspondence Analysis to Organize Data Cubes

Proceedings of the 2007 conference on Databases and Information Systems IV: Selected Papers from the Seventh International Baltic Conference DB&IS'2006
BitCube: A Bottom-Up Cubing Engineering

DaWaK '09 Proceedings of the 11th International Conference on Data Warehousing and Knowledge Discovery
An efficient method for maintaining data cubes incrementally

Information Sciences: an International Journal
Revisiting the cube lifecycle in the presence of hierarchies

The VLDB Journal — The International Journal on Very Large Data Bases
Adapting OLAP analysis to the user's interest through virtual cubes

FSKD'06 Proceedings of the Third international conference on Fuzzy Systems and Knowledge Discovery
On the computation of maximal-correlated cuboids cells

DaWaK'06 Proceedings of the 8th international conference on Data Warehousing and Knowledge Discovery

Abstract

Data cube computation and representation are prohibitively expensive in terms of time and space. Prior work has focused on either reducing the computation time or condensing the representation of a data cube. In this paper, we introduce Range Cubing as an efficient way to compute and compress the data cube without any loss of precision. A new data structure, range trie, is used to compress and identify correlation in attribute values, and compress the input dataset to effectively reduce the computational cost. The range cubing algorithm generates a compressed cube, called range cube, which partitions all cells into disjoint ranges. Each range represents a subset of cells with the same aggregation value, as a tuple which has the same number of dimensions as the input data tuples. The range cube preserves the roll-up/drill-down semantics of a data cube. Compared to H-Cubing, experiments on real dataset show a running time of less than one thirtieth, still generating a range cube of less than one ninth of the space of the full cube, when both algorithms run in their preferred dimension orders. On synthetic data, range cubing demonstrates much better scalability, as well as higher adaptiveness to both data sparsity and skew.
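
The abstract's central observation, that correlated attribute values cause many cube cells to share one aggregation value, can be illustrated with a small brute-force sketch. This is not the paper's range trie or range cubing algorithm, and the grouping used here (cells that cover exactly the same input tuples) is an assumption chosen for illustration rather than the paper's definition of a range. The toy rows, the `ALL` wildcard, and the function names are all hypothetical.

```python
from collections import defaultdict
from itertools import product

ALL = "*"  # hypothetical wildcard: the dimension is rolled up


def full_cube(rows):
    """Map every non-empty cube cell to the set of row indices it covers."""
    cover = defaultdict(set)
    for idx, (dims, _measure) in enumerate(rows):
        # A row with d dimensions contributes to 2^d cells:
        # each dimension is either kept or replaced by ALL.
        for mask in product((False, True), repeat=len(dims)):
            cell = tuple(v if keep else ALL for v, keep in zip(dims, mask))
            cover[cell].add(idx)
    return cover


def group_by_cover(cover):
    """Group cells that cover the same rows; such cells share every aggregate."""
    groups = defaultdict(list)
    for cell, idxs in cover.items():
        groups[frozenset(idxs)].append(cell)
    return groups


def sum_measure(idxs, rows):
    """SUM aggregate over the rows identified by idxs."""
    return sum(rows[i][1] for i in idxs)


# Toy data (store, city, product) -> sales. Store and city are perfectly
# correlated: s1 only occurs with c1, and s2 only with c2.
rows = [
    (("s1", "c1", "p1"), 10),
    (("s1", "c1", "p2"), 20),
    (("s2", "c2", "p1"), 5),
]

cover = full_cube(rows)
groups = group_by_cover(cover)

print("cells in the full cube:", len(cover))
print("groups of cells with identical cover:", len(groups))
for idxs, cells in groups.items():
    print(sum_measure(idxs, rows), sorted(cells))
```

Because store and city are correlated in the toy data, cells such as `("s1", "*", "*")`, `("*", "c1", "*")` and `("s1", "c1", "*")` cover the same rows and therefore carry the same sum. Storing one entry per group instead of one per cell loses no precision, which is the kind of redundancy the abstract says the range cube exploits.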