Implementing data cubes efficiently
SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data
An array-based algorithm for simultaneous multidimensional aggregates
SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
Bottom-up computation of sparse and Iceberg CUBE
SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
Efficient computation of Iceberg cubes with complex measures
SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Totals
Data Mining and Knowledge Discovery
Discovery-Driven Exploration of OLAP Data Cubes
EDBT '98 Proceedings of the 6th International Conference on Extending Database Technology: Advances in Database Technology
Fast Computation of Sparse Datacubes
VLDB '97 Proceedings of the 23rd International Conference on Very Large Data Bases
On the Computation of Multidimensional Aggregates
VLDB '96 Proceedings of the 22th International Conference on Very Large Data Bases
Condensed Cube: An Efficient Approach to Reducing Data Cube Size
ICDE '02 Proceedings of the 18th International Conference on Data Engineering
MM-Cubing: Computing Iceberg Cubes by Factorizing the Lattice Space
SSDBM '04 Proceedings of the 16th International Conference on Scientific and Statistical Database Management
Mining Constrained Gradients in Large Databases
IEEE Transactions on Knowledge and Data Engineering
Data Mining: Concepts and Techniques
Stream Cube: An Architecture for Multi-Dimensional Analysis of Data Streams
Distributed and Parallel Databases
C-Cubing: Efficient Computation of Closed Cubes by Aggregation-Based Checking
ICDE '06 Proceedings of the 22nd International Conference on Data Engineering
Regression Cubes with Lossless Compression and Aggregation
IEEE Transactions on Knowledge and Data Engineering
Database in Depth: Relational Theory for Practitioners
Efficient approaches for materialized views selection in a data warehouse
Information Sciences: an International Journal
Data warehouse enhancement: A semantic cube model approach
Information Sciences: an International Journal
Progressive ranking of range aggregates
Data & Knowledge Engineering
Quotient cube: how to summarize the semantics of a data cube
VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases
Star-cubing: computing iceberg cubes by top-down and bottom-up integration
VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
High-dimensional OLAP: a minimal cubing approach
VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
OLAP over imprecise data with domain constraints
VLDB '07 Proceedings of the 33rd international conference on Very large data bases
ARCube: supporting ranking aggregate queries in partially materialized data cubes
Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Sampling cube: a framework for statistical OLAP over sampling data
Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Supporting OLAP operations over imperfectly integrated taxonomies
Proceedings of the 2008 ACM SIGMOD international conference on Management of data
SBBD '08 Proceedings of the 23rd Brazilian symposium on Databases
Graph OLAP: Towards Online Analytical Processing on Graphs
ICDM '08 Proceedings of the 2008 Eighth IEEE International Conference on Data Mining
Computing data cubes using exact sub-graph matching: the sequential MCG approach
Proceedings of the 2009 ACM symposium on Applied Computing
Emerging Cubes: Borders, size estimations and lossless reductions
Information Systems
P-Cube: Answering Preference Queries in Multi-Dimensional Space
ICDE '08 Proceedings of the 2008 IEEE 24th International Conference on Data Engineering
We present a new full cube computation technique and cube storage representation, called the multidimensional cyclic graph (MCG) approach. The data cube relational operator has exponential complexity, so materializing a cube requires both a huge amount of memory and a substantial amount of time. Reducing the size of data cubes without loss of generality is therefore a fundamental problem. Previous approaches, such as Dwarf, Star and MDAG, have substantially reduced cube size using graph representations. In general, they eliminate prefix redundancy and some suffix redundancy from a data cube. The MCG differs significantly from these approaches in that it completely eliminates both prefix and suffix redundancy. A data cube can be viewed as a set of sub-graphs. Redundant sub-graphs are quite common in a data cube, but eliminating them is a hard problem, and the Dwarf, Star and MDAG approaches eliminate only some specific common sub-graphs. The MCG approach efficiently eliminates all common sub-graphs from the entire cube, based on an exact sub-graph matching solution. We propose a matching function that guarantees a one-to-one mapping between sub-graphs. The function is computed incrementally, in a top-down fashion, and uses a minimal amount of information to generate unique results. In addition, it can be computed for any type of measure: distributive, algebraic or holistic.
Performance analysis demonstrates that MCG is 20-40% faster than the Dwarf, Star and MDAG approaches when computing sparse data cubes. Dense data cubes have a small number of aggregations, which leaves little room for optimizing runtime and memory consumption, so the MCG approach is not useful for computing such cubes. The compact representation of sparse data cubes enables the MCG approach to reduce memory consumption by 70-90% compared to the original Star approach, proposed in [33]. In the same scenarios, and relative to the same baseline, the improved Star approach, proposed in [34], reduces memory consumption by only 10-30%, Dwarf by 30-50% and MDAG by 40-60%. The MCG is the first approach that uses an exact sub-graph matching function to reduce cube size, which avoids unnecessary aggregation and thereby improves cube computation runtime.
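The abstract does not give the MCG matching function, so the following is only a minimal Python sketch of the general idea of removing prefix and suffix redundancy, not the authors' algorithm. It builds a prefix tree over the base tuples (which shares common prefixes) and then merges structurally identical sub-graphs bottom-up using a signature made of the node's measure and its labelled children. The names `Node`, `build_trie`, `share_subgraphs` and `count_nodes` are hypothetical, the measure is assumed to be a simple COUNT, and the `*` (ALL) aggregates of a full cube are left out for brevity.

```python
class Node:
    """One vertex of the cube graph: outgoing labelled edges plus a COUNT measure."""
    def __init__(self):
        self.children = {}   # attribute value -> child Node
        self.count = 0       # assumed measure: number of base tuples below this node


def build_trie(rows):
    """Insert every tuple into a prefix tree; shared prefixes are stored once."""
    root = Node()
    for row in rows:
        node = root
        node.count += 1
        for value in row:
            node = node.children.setdefault(value, Node())
            node.count += 1
    return root


def share_subgraphs(node, table):
    """Bottom-up merge of identical sub-graphs (illustrative, not the MCG function).

    Two nodes are considered identical when they carry the same measure and the
    same labelled edges to already-canonical children. `table` maps each
    signature to its single canonical node.
    """
    for label, child in list(node.children.items()):
        node.children[label] = share_subgraphs(child, table)
    signature = (
        node.count,
        tuple(sorted((label, id(child)) for label, child in node.children.items())),
    )
    return table.setdefault(signature, node)


def count_nodes(root):
    """Count distinct nodes reachable from root (shared nodes are counted once)."""
    seen, stack = set(), [root]
    while stack:
        node = stack.pop()
        if id(node) in seen:
            continue
        seen.add(id(node))
        stack.extend(node.children.values())
    return len(seen)


def lookup(root, path):
    """Follow attribute values from the root and return the COUNT at the end."""
    node = root
    for value in path:
        node = node.children[value]
    return node.count


# Toy sparse relation with three dimensions (A, B, C).
rows = [("a1", "b1", "c1"), ("a2", "b1", "c1"), ("a3", "b2", "c1")]

trie = build_trie(rows)
nodes_before = count_nodes(trie)          # 10 nodes: prefix sharing only

shared = share_subgraphs(trie, {})
nodes_after = count_nodes(shared)         # 5 nodes: identical suffixes merged
```

In this toy input the three identical `c1` leaves and the sub-graphs above them collapse into single shared nodes, while lookups such as `lookup(shared, ("a2", "b1"))` still return the same counts as before. Comparing whole sub-graphs through a compact signature, rather than node by node, is the kind of saving an exact sub-graph matching function aims for; how MCG computes its signature incrementally and top-down is not specified in the text above.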