Computing Iceberg Cubes by Top-Down and Bottom-Up Integration: The StarCubing Approach

Authors:
Dong Xin;Jiawei Han;Xiaolei Li;Zheng Shao;Benjamin W. Wah
Affiliations:
IEEE;IEEE;-;-;IEEE
Venue:
IEEE Transactions on Knowledge and Data Engineering
Year:
2007

Citing 29
Cited 12

Implementing data cubes efficiently

SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data
An array-based algorithm for simultaneous multidimensional aggregates

SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
Quasi-cubes: exploiting approximations in multidimensional databases

ACM SIGMOD Record
Exploratory mining and pruning optimizations of constrained associations rules

SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Data cube approximation and histograms via wavelets

Proceedings of the seventh international conference on Information and knowledge management
Bottom-up computation of sparse and Iceberg CUBE

SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
Compressed data cubes for OLAP aggregate query approximation on continuous dimensions

KDD '99 Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining
Mining frequent patterns without candidate generation

SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Efficient computation of Iceberg cubes with complex measures

SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
Dwarf: shrinking the PetaCube

Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Totals

Data Mining and Knowledge Discovery
Cubegrades: Generalizing Association Rules

Data Mining and Knowledge Discovery
Discovery-Driven Exploration of OLAP Data Cubes

EDBT '98 Proceedings of the 6th International Conference on Extending Database Technology: Advances in Database Technology
Index Selection for OLAP

ICDE '97 Proceedings of the Thirteenth International Conference on Data Engineering
Selection of Views to Materialize in a Data Warehouse

ICDT '97 Proceedings of the 6th International Conference on Database Theory
Fast Computation of Sparse Datacubes

VLDB '97 Proceedings of the 23rd International Conference on Very Large Data Bases
Materialized Views Selection in a Multidimensional Database

VLDB '97 Proceedings of the 23rd International Conference on Very Large Data Bases
Materialized View Selection for Multidimensional Datasets

VLDB '98 Proceedings of the 24th International Conference on Very Large Data Bases
Fast Algorithms for Mining Association Rules in Large Databases

VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
On the Computation of Multidimensional Aggregates

VLDB '96 Proceedings of the 22nd International Conference on Very Large Data Bases
QC-trees: an efficient summary structure for semantic OLAP

Proceedings of the 2003 ACM SIGMOD international conference on Management of data
Condensed Cube: An Efficient Approach to Reducing Data Cube Size

ICDE '02 Proceedings of the 18th International Conference on Data Engineering
Efficient computation of multiple group by queries

Proceedings of the 2005 ACM SIGMOD international conference on Management of data
Data Mining: Concepts and Techniques

Data Mining: Concepts and Techniques
C-Cubing: Efficient Computation of Closed Cubes by Aggregation-Based Checking

ICDE '06 Proceedings of the 22nd International Conference on Data Engineering
Multi-dimensional regression analysis of time-series data streams

VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases
Quotient cube: how to summarize the semantics of a data cube

VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases
High-dimensional OLAP: a minimal cubing approach

VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
The polynomial complexity of fully materialized coalesced cubes

VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30

A Probabilistic Approach for Computing Approximate Iceberg Cubes

DEXA '08 Proceedings of the 19th international conference on Database and Expert Systems Applications
Computing data cubes without redundant aggregated nodes and single graph paths: the sequential MCG approach

SBBD '08 Proceedings of the 23rd Brazilian symposium on Databases
Computing data cubes using exact sub-graph matching: the sequential MCG approach

Proceedings of the 2009 ACM symposium on Applied Computing
Emerging Cubes: Borders, size estimations and lossless reductions

Information Systems
BitCube: A Bottom-Up Cubing Engineering

DaWaK '09 Proceedings of the 11th International Conference on Data Warehousing and Knowledge Discovery
Exact and Approximate Sizes of Convex Datacubes

DaWaK '09 Proceedings of the 11th International Conference on Data Warehousing and Knowledge Discovery
Reduced representations of Emerging Cubes for OLAP database mining

International Journal of Business Intelligence and Data Mining
An efficient method for maintaining data cubes incrementally

Information Sciences: an International Journal
Double table switch: an efficient partitioning algorithm for bottom-up computation of data cubes

ADMA'10 Proceedings of the 6th international conference on Advanced data mining and applications - Volume Part II
Extracting semantics in OLAP databases using emerging cubes

Information Sciences: an International Journal
ClustCube: an OLAP-based framework for clustering and mining complex database objects

Proceedings of the 2011 ACM Symposium on Applied Computing
Enhanced clustering of complex database objects in the clustcube framework

Proceedings of the fifteenth international workshop on Data warehousing and OLAP

Abstract

Data cube computation is one of the most essential but expensive operations in data warehousing. Previous studies have developed two major approaches, top-down versus bottom-up. The former, represented by the MultiWay Array Cube (called the MultiWay) algorithm [30], aggregates simultaneously on multiple dimensions; however, it cannot take advantage of a priori pruning [2] when computing iceberg cubes (cubes that contain only aggregate cells whose measure values satisfy a threshold, called the iceberg condition). The latter, represented by BUC [6], computes the iceberg cube bottom-up and facilitates a priori pruning. BUC explores fast sorting and partitioning techniques; however, it does not fully explore multidimensional simultaneous aggregation. In this paper, we present a new method, Star-Cubing, that integrates the strengths of the previous two algorithms and performs aggregations on multiple dimensions simultaneously. It utilizes a star-tree structure, extends the simultaneous aggregation methods, and enables the pruning of the group-bys that do not satisfy the iceberg condition. Our performance study shows that Star-Cubing is highly efficient and outperforms the previous methods.
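
To make the iceberg condition and a priori pruning concrete, here is a minimal Python sketch of a BUC-style recursive partitioning. It is not the Star-Cubing algorithm and does not use a star-tree. It assumes the measure is a plain tuple count, the iceberg condition is count >= min_count, and "*" marks an aggregated dimension. The names iceberg_cube, expand, and min_count are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def iceberg_cube(rows, num_dims, min_count):
    """Return {cell: count} for every cell whose count is >= min_count.

    Illustrative sketch only: a cell is a tuple of length num_dims in which
    "*" marks a dimension that has been aggregated away.
    """
    result = {}

    def expand(partition, cell, start_dim):
        # The caller has already checked the threshold for this partition,
        # so its cell belongs to the iceberg cube.
        result[tuple(cell)] = len(partition)
        for d in range(start_dim, num_dims):
            # Partition the current rows on dimension d.
            groups = defaultdict(list)
            for row in partition:
                groups[row[d]].append(row)
            for value, group in groups.items():
                # Pruning: counts can only shrink as more dimensions are
                # fixed, so a group below the threshold cannot contain any
                # qualifying cell and is never expanded further.
                if len(group) >= min_count:
                    cell[d] = value
                    expand(group, cell, d + 1)
                    cell[d] = "*"

    if len(rows) >= min_count:
        expand(list(rows), ["*"] * num_dims, 0)
    return result


if __name__ == "__main__":
    # Hypothetical two-dimensional fact table (A, B).
    table = [("a1", "b1"), ("a1", "b2"), ("a1", "b1"), ("a2", "b1")]
    for cell, count in sorted(iceberg_cube(table, 2, 2).items()):
        print(cell, count)
```

The pruning step relies on count being anti-monotone: fixing one more dimension can never increase a cell's count. This sketch visits one dimension at a time and so does not share work across group-bys, which is the limitation of the bottom-up approach that the abstract points out.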