The Multi-Tree Cubing algorithm for computing iceberg cubes

Authors:
Xing Li; Howard J. Hamilton; Kamran Karimi; Liqiang Geng
Affiliations:
Department of Computer Science, University of Regina, Regina, Canada S4S 0A2 (all authors)
Venue:
Journal of Intelligent Information Systems
Year:
2009



Abstract

The computation of data cubes is one of the most expensive operations in on-line analytical processing (OLAP). To improve efficiency, an iceberg cube represents only the cells whose aggregate values are above a given threshold (minimum support). Top-down and bottom-up approaches are used to compute the iceberg cube for a data set, but both have performance limitations. In this paper, a new algorithm, called Multi-Tree Cubing (MTC), is proposed for computing an iceberg cube. The Multi-Tree Cubing algorithm is an integrated top-down and bottom-up approach. Overall control is handled in a top-down manner, so MTC features shared computation. By processing the orderings in the opposite order from the Top-Down Computation algorithm, the MTC algorithm is able to prune attributes. The Bottom-Up Computation (BUC) algorithm and its variations also perform pruning, but they do so by processing intermediate partitions. The MTC algorithm, however, prunes without processing such partitions. The MTC algorithm is based on a specialized type of prefix tree data structure, called an Attribute-Partition tree (AP-tree), consisting of attribute and partition nodes. The AP-tree facilitates fast, in-memory sorting and APRIORI-like pruning. We report on five series of experiments, which confirm that MTC is consistently as fast as or faster than BUC, while finding the same iceberg cubes.
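To make the notion of an iceberg cube concrete, the following is a minimal sketch of a bottom-up, partition-based computation in the spirit of BUC, using COUNT as the measure. It is not the MTC algorithm and does not build an AP-tree; the function name `iceberg_cube`, the `ALL` marker, and the sample rows are assumptions introduced only for illustration. It shows the kind of APRIORI-like pruning the abstract refers to: once a partition falls below the minimum support, no more specific cell derived from it can qualify, so the partition is not expanded further.

```python
from collections import defaultdict

ALL = "*"  # hypothetical marker for "any value" in a dimension


def iceberg_cube(rows, num_dims, min_support):
    """Return {cell: count} for every cell whose count >= min_support.

    A cell is a tuple with one entry per dimension: either a concrete
    value or ALL. This is a BUC-style sketch, not the MTC algorithm.
    """
    result = {}

    def expand(partition, cell, start_dim):
        # `partition` holds exactly the rows that match `cell`.
        result[tuple(cell)] = len(partition)
        for dim in range(start_dim, num_dims):
            # Partition the current rows on the values of dimension `dim`.
            groups = defaultdict(list)
            for row in partition:
                groups[row[dim]].append(row)
            for value, group in groups.items():
                # Apriori-like pruning: COUNT can only shrink as more
                # dimensions are fixed, so small groups are skipped.
                if len(group) >= min_support:
                    cell[dim] = value
                    expand(group, cell, dim + 1)
                    cell[dim] = ALL

    if len(rows) >= min_support:
        expand(rows, [ALL] * num_dims, 0)
    return result


# Illustrative data: two dimensions, minimum support of 2.
example_rows = [("a", "x"), ("a", "y"), ("a", "x"), ("b", "x")]
cube = iceberg_cube(example_rows, num_dims=2, min_support=2)
for cell, count in sorted(cube.items()):
    print(cell, count)
```

Note that this sketch has to materialize each intermediate partition before it can decide whether to prune it, which is the cost the abstract says MTC avoids.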