Efficient computation of Iceberg cubes with complex measures

Authors:
Jiawei Han; Jian Pei; Guozhu Dong; Ke Wang
Affiliations:
School of Computing Science, Simon Fraser University, B.C., Canada; School of Computing Science, Simon Fraser University, B.C., Canada; Department of Computer Science, Wright State University, Dayton, OH; School of Computing Science, Simon Fraser University, B.C., Canada
Venue:
SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
Year:
2001

Abstract

It is often too expensive to compute and materialize a complete high-dimensional data cube. Computing an iceberg cube, which contains only aggregates above certain thresholds, is an effective way to derive nontrivial multi-dimensional aggregations for OLAP and data mining. In this paper, we study efficient methods for computing iceberg cubes with some popularly used complex measures, such as average, and develop a methodology that adopts a weaker but anti-monotonic condition for testing and pruning the search space. In particular, for efficient computation of iceberg cubes with the average measure, we propose a top-k average pruning method and extend two previously studied methods, Apriori and BUC, to Top-k Apriori and Top-k BUC. To further improve performance, an interesting hypertree structure, called H-tree, is designed and a new iceberg cubing method, called Top-k H-Cubing, is developed. Our performance study shows that Top-k BUC and Top-k H-Cubing are two promising candidates for scalable computation, and Top-k H-Cubing has better performance in most cases.
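
The weaker anti-monotonic test mentioned in the abstract can be illustrated with a minimal Python sketch. It assumes an iceberg condition of the form "count >= k and average >= min_avg"; the function names, parameter names, and sample values are hypothetical and are not taken from the paper. The idea shown: the average itself is not anti-monotonic, but the average of the k largest measure values in a cell is an upper bound on the average of any sub-cell holding at least k tuples, so a cell whose top-k average falls below the threshold can be pruned together with all of its descendants.

```python
import heapq


def top_k_average(values, k):
    """Average of the k largest values, or None when fewer than k values exist."""
    if len(values) < k:
        return None
    return sum(heapq.nlargest(k, values)) / k


def can_prune(values, k, min_avg):
    """True when no sub-cell with at least k tuples can reach min_avg.

    Any sub-cell draws its tuples from `values`, so its average cannot
    exceed the average of the k largest values in this cell.
    """
    bound = top_k_average(values, k)
    return bound is None or bound < min_avg


def satisfies_iceberg(values, k, min_avg):
    """The assumed iceberg condition: enough tuples and a high enough average."""
    return len(values) >= k and sum(values) / len(values) >= min_avg


# Hypothetical measure values for three cells, with k = 2 and min_avg = 50.
cell_a = [10, 90, 80, 20]   # average 50: qualifies as is
cell_b = [10, 60, 20, 30]   # top-2 average 45: safe to prune
cell_c = [10, 90, 80, 12]   # average 48 fails, but top-2 average 85: keep exploring
```

In this sketch cell_c does not satisfy the iceberg condition itself, yet it must not be pruned, because a sub-cell containing only the tuples with values 90 and 80 would qualify. That is the sense in which the top-k test is weaker than the original condition while still being safe for pruning.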