Star-cubing: computing iceberg cubes by top-down and bottom-up integration

Authors:
Dong Xin;Jiawei Han;Xiaolei Li;Benjamin W. Wah
Affiliations:
University of Illinois at Urbana-Champaign, Urbana, IL;University of Illinois at Urbana-Champaign, Urbana, IL;University of Illinois at Urbana-Champaign, Urbana, IL;University of Illinois at Urbana-Champaign, Urbana, IL
Venue:
VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
Year:
2003

Citing 22

Implementing data cubes efficiently
SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data

An array-based algorithm for simultaneous multidimensional aggregates
SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data

Quasi-cubes: exploiting approximations in multidimensional databases
ACM SIGMOD Record

Exploratory mining and pruning optimizations of constrained associations rules
SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data

Data cube approximation and histograms via wavelets
Proceedings of the seventh international conference on Information and knowledge management

Bottom-up computation of sparse and Iceberg CUBE
SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data

Compressed data cubes for OLAP aggregate query approximation on continuous dimensions
KDD '99 Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining

Mining frequent patterns without candidate generation
SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data

Efficient computation of Iceberg cubes with complex measures
SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data

Dwarf: shrinking the PetaCube
Proceedings of the 2002 ACM SIGMOD international conference on Management of data

Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Totals
Data Mining and Knowledge Discovery

Cubegrades: Generalizing Association Rules
Data Mining and Knowledge Discovery

Discovery-Driven Exploration of OLAP Data Cubes
EDBT '98 Proceedings of the 6th International Conference on Extending Database Technology: Advances in Database Technology

Index Selection for OLAP
ICDE '97 Proceedings of the Thirteenth International Conference on Data Engineering

Selection of Views to Materialize in a Data Warehouse
ICDT '97 Proceedings of the 6th International Conference on Database Theory

Fast Computation of Sparse Datacubes
VLDB '97 Proceedings of the 23rd International Conference on Very Large Data Bases

Materialized Views Selection in a Multidimensional Database
VLDB '97 Proceedings of the 23rd International Conference on Very Large Data Bases

Materialized View Selection for Multidimensional Datasets
VLDB '98 Proceedings of the 24th International Conference on Very Large Data Bases

Fast Algorithms for Mining Association Rules in Large Databases
VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases

QC-trees: an efficient summary structure for semantic OLAP
Proceedings of the 2003 ACM SIGMOD international conference on Management of data

Multi-dimensional regression analysis of time-series data streams
VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases

Quotient cube: how to summarize the semantics of a data cube
VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases

Cited 45

Range CUBE: Efficient Cube Computation by Exploiting Data Correlation
ICDE '04 Proceedings of the 20th International Conference on Data Engineering

PnP: Parallel and External Memory Iceberg Cube Computation
ICDE '05 Proceedings of the 21st International Conference on Data Engineering

Efficient computation of the skyline cube
VLDB '05 Proceedings of the 31st international conference on Very large data bases

Fast Algorithms for Frequent Itemset Mining Using FP-Trees
IEEE Transactions on Knowledge and Data Engineering

Stream Cube: An Architecture for Multi-Dimensional Analysis of Data Streams
Distributed and Parallel Databases

The cgmCUBE project: Optimizing parallel data cube generation for ROLAP
Distributed and Parallel Databases

CURE for cubes: cubing using a ROLAP engine
VLDB '06 Proceedings of the 32nd international conference on Very large data bases

Flowcube: constructing RFID flowcubes for multi-dimensional analysis of commodity flows
VLDB '06 Proceedings of the 32nd international conference on Very large data bases

Towards multidimensional subspace skyline analysis
ACM Transactions on Database Systems (TODS)

BIwTL: a business information warehouse toolkit and language for warehousing simplification and automation
Proceedings of the 2007 ACM SIGMOD international conference on Management of data

Answering XML queries by means of data summaries
ACM Transactions on Information Systems (TOIS)

Efficient Computation of Iceberg Cubes by Bounding Aggregate Functions
IEEE Transactions on Knowledge and Data Engineering

ROLAP implementations of the data cube
ACM Computing Surveys (CSUR)

High-dimensional OLAP: a minimal cubing approach
VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30

Ix-cubes: iceberg cubes for data warehousing and olap on xml data
Proceedings of the sixteenth ACM conference on Conference on information and knowledge management

DataScope: viewing database contents in Google Maps' way
VLDB '07 Proceedings of the 33rd international conference on Very large data bases

PnP: sequential, external memory, and parallel iceberg cube computation
Distributed and Parallel Databases

ARCube: supporting ranking aggregate queries in partially materialized data cubes
Proceedings of the 2008 ACM SIGMOD international conference on Management of data

Sampling cube: a framework for statistical olap over sampling data
Proceedings of the 2008 ACM SIGMOD international conference on Management of data

Supporting the data cube lifecycle: the power of ROLAP
The VLDB Journal — The International Journal on Very Large Data Bases

Pruning attribute values from data cubes with diamond dicing
IDEAS '08 Proceedings of the 2008 international symposium on Database engineering & applications

Bellwether analysis: Searching for cost-effective query-defined predictors in large databases
ACM Transactions on Knowledge Discovery from Data (TKDD)

On-line evaluation of a data cube over a data stream
ACS'08 Proceedings of the 8th conference on Applied computer science

Answering aggregate keyword queries on relational databases using minimal group-bys
Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology

Computing data cubes using exact sub-graph matching: the sequential MCG approach
Proceedings of the 2009 ACM symposium on Applied Computing

Fast and dynamic OLAP exploration using UDFs
Proceedings of the 2009 ACM SIGMOD International Conference on Management of data

What Can Formal Concept Analysis Do for Data Warehouses?
ICFCA '09 Proceedings of the 7th International Conference on Formal Concept Analysis

The Multi-Tree Cubing algorithm for computing iceberg cubes
Journal of Intelligent Information Systems

Compressing multidimensional structures: a case study
ECC'09 Proceedings of the 3rd international conference on European computing conference

PHC: a rapid parallel hierarchical cubing algorithm on high dimensional OLAP
ICA3PP'07 Proceedings of the 7th international conference on Algorithms and architectures for parallel processing

Revisiting the cube lifecycle in the presence of hierarchies
The VLDB Journal — The International Journal on Very Large Data Bases

A high performance hierarchical cubing algorithm and efficient OLAP in high-dimensional data warehouse
PAKDD'07 Proceedings of the 2007 international conference on Emerging technologies in knowledge discovery and data mining

Multidimensional cyclic graph approach: Representing a data cube without common sub-graphs
Information Sciences: an International Journal

Latent OLAP: data cubes over latent variables
Proceedings of the 2011 ACM SIGMOD International Conference on Management of data

TEXplorer: keyword-based object search and exploration in multidimensional text databases
Proceedings of the 20th ACM international conference on Information and knowledge management

Parallel data cubes on multi-core processors with multiple disks
Proceedings of the 2011 Conference of the Center for Advanced Studies on Collaborative Research

Ag-Tree: a novel structure for range queries in data warehouse environments
DASFAA'06 Proceedings of the 11th international conference on Database Systems for Advanced Applications

A parallel and distributed method for computing high dimensional MOLAP
NPC'05 Proceedings of the 2005 IFIP international conference on Network and Parallel Computing

Multiway iceberg cubing on trees
WISE'05 Proceedings of the 6th international conference on Web Information Systems Engineering

Computing iceberg quotient cubes with bounding
DaWaK'06 Proceedings of the 8th international conference on Data Warehousing and Knowledge Discovery

An efficient indexing technique for computing high dimensional data cubes
WAIM '06 Proceedings of the 7th international conference on Advances in Web-Age Information Management

Computing high dimensional MOLAP with parallel shell mini-cubes
FSKD'05 Proceedings of the Second international conference on Fuzzy Systems and Knowledge Discovery - Volume Part I

A clustered Dwarf structure to speed up queries on data cubes
DaWaK'07 Proceedings of the 9th international conference on Data Warehousing and Knowledge Discovery

Collective cubing platform towards definition and analysis of warehouse cubes
ICCCI'12 Proceedings of the 4th international conference on Computational Collective Intelligence: technologies and applications - Volume Part II

Constrained Cube Lattices for Multidimensional Database Mining
International Journal of Data Warehousing and Mining

Abstract

Data cube computation is one of the most essential but expensive operations in data warehousing. Previous studies have developed two major approaches: top-down and bottom-up. The former, represented by the Multi-Way Array Cube (MultiWay) algorithm [25], aggregates simultaneously on multiple dimensions; however, it cannot take advantage of Apriori pruning [2] when computing iceberg cubes (cubes that contain only aggregate cells whose measure value satisfies a threshold, called the iceberg condition). The latter, represented by two algorithms, BUC [6] and H-Cubing [11], computes the iceberg cube bottom-up and facilitates Apriori pruning. BUC exploits fast sorting and partitioning techniques, whereas H-Cubing exploits a data structure, the H-Tree, for shared computation. However, neither of them fully exploits multi-dimensional simultaneous aggregation. In this paper, we present a new method, Star-Cubing, that integrates the strengths of the previous three algorithms and performs aggregations on multiple dimensions simultaneously. It utilizes a star-tree structure, extends the simultaneous aggregation methods, and enables the pruning of group-bys that do not satisfy the iceberg condition. Our performance study shows that Star-Cubing is highly efficient and outperforms all the previous methods in almost all kinds of data distributions.
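
To make the iceberg condition and Apriori pruning concrete, the following is a minimal illustrative sketch in Python. It is not the Star-Cubing algorithm and not the authors' code: it is a simple recursive partitioning scheme in the spirit of the bottom-up approach, it assumes the measure is a tuple count, and it assumes the iceberg condition is count >= min_count. The function name iceberg_cube, its parameters, and the '*' marker for an aggregated dimension are all hypothetical choices made for this example.

```python
from collections import defaultdict


def iceberg_cube(rows, num_dims, min_count):
    """Return {cell: count} for every cell whose count is >= min_count.

    A cell is a tuple of length num_dims in which '*' marks an
    aggregated (unconstrained) dimension. Illustrative sketch only.
    """
    result = {}

    def expand(partition, cell, start_dim):
        # Every row in `partition` matches `cell` on its fixed dimensions,
        # so the cell's count is simply the partition size.
        result[cell] = len(partition)
        for d in range(start_dim, num_dims):
            # Partition the current rows by their value on dimension d.
            groups = defaultdict(list)
            for row in partition:
                groups[row[d]].append(row)
            for value, group in groups.items():
                # Apriori pruning: count can only shrink as more dimensions
                # are fixed, so a group below the threshold cannot contain
                # any qualifying descendant cell and is skipped entirely.
                if len(group) >= min_count:
                    child = cell[:d] + (value,) + cell[d + 1:]
                    expand(group, child, d + 1)

    if len(rows) >= min_count:
        expand(list(rows), ("*",) * num_dims, 0)
    return result


if __name__ == "__main__":
    sample = [
        ("a1", "b1", "c1"),
        ("a1", "b1", "c2"),
        ("a1", "b2", "c1"),
        ("a2", "b1", "c1"),
    ]
    for cell, count in iceberg_cube(sample, 3, 2).items():
        print(cell, count)
```

With the four sample tuples and min_count = 2, the sketch keeps seven cells, including the apex cell ('*', '*', '*') with count 4, and never descends into the partition for a2 because it holds a single tuple. Each recursive call aggregates along one dimension at a time, which is the limitation the abstract points out for bottom-up methods.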