New Algorithm for Computing Cube on Very Large Compressed Data Sets

Authors: Weili Wu; Hong Gao; Jianzhong Li
Affiliations: IEEE; -; IEEE
Venue: IEEE Transactions on Knowledge and Data Engineering
Year: 2006

Abstract

Data compression is an effective technique for improving the performance of data warehouses. Because the cube operation is at the core of online analytical processing (OLAP) in data warehouses, developing efficient algorithms for computing the cube on compressed data warehouses is a major challenge. To our knowledge, very few cube computation techniques have been proposed for compressed data warehouses to date. This paper presents a novel algorithm for computing cubes on compressed data warehouses. The algorithm operates directly on compressed data sets, without first decompressing them, and is applicable to a large class of mapping-complete data compression methods. The complexity of the algorithm is analyzed in detail. The analytical and experimental results show that the algorithm is more efficient than all other existing cube algorithms. In addition, a heuristic algorithm for generating an optimal plan for computing the cube is proposed.
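To make the idea of computing a cube directly on compressed data concrete, the following is a minimal illustrative sketch. It is not the algorithm from the paper. It assumes a simple compressed form in which only non-empty cells are stored as (row-major position, value) pairs, and the names `backward_map` and `cube` are hypothetical. The cube is taken to be the SUM aggregate over every subset of dimensions, computed naively with one pass per group-by.

```python
from collections import defaultdict
from itertools import combinations


def backward_map(pos, shape):
    """Recover cell coordinates from a row-major linear position.

    Assumption: the compression scheme keeps each stored value's logical
    position, so coordinates can be derived without decompressing.
    """
    coords = []
    for size in reversed(shape):
        pos, c = divmod(pos, size)
        coords.append(c)
    return tuple(reversed(coords))


def cube(compressed, shape):
    """Compute SUM for all 2^n group-bys over the compressed cells.

    compressed: list of (linear_position, value) for non-empty cells only.
    shape: size of each dimension of the logical array.
    Returns {dims: {group_key: sum}}, where dims is a tuple of dimension indexes.
    """
    n = len(shape)
    result = {}
    for k in range(n + 1):
        for dims in combinations(range(n), k):
            groups = defaultdict(int)
            # Scan only the stored cells; empty cells are never materialized.
            for pos, value in compressed:
                coords = backward_map(pos, shape)
                key = tuple(coords[d] for d in dims)
                groups[key] += value
            result[dims] = dict(groups)
    return result


# A 2 x 3 logical array with three non-empty cells.
shape = (2, 3)
compressed = [(0, 5), (2, 3), (4, 4)]  # cells (0,0)=5, (0,2)=3, (1,1)=4
result = cube(compressed, shape)
```

Here `result[()]` holds the grand total and `result[(0,)]` the totals grouped by the first dimension. This naive version rescans the stored cells for every group-by; an efficient algorithm would avoid that by sharing work between group-bys, which is where a computation plan matters.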