On-line Analytical Processing (OLAP) has become one of the most powerful and prominent technologies for knowledge discovery in VLDB (Very Large Database) environments. Central to the OLAP paradigm is the data cube, a multi-dimensional hierarchy of aggregate values that provides a rich analytical model for decision support. Various sequential algorithms for the efficient generation of the data cube have appeared in the literature. However, given the size of contemporary data warehousing repositories, multi-processor solutions are crucial for the massive computational demands of current and future OLAP systems.

In this paper we discuss the cgmCUBE Project, a multi-year effort to design and implement a multi-processor platform for data cube generation that targets the relational database model (ROLAP). More specifically, we discuss new algorithmic and system optimizations relating to (1) a thorough optimization of the underlying sequential cube construction method and (2) a detailed and carefully engineered cost model for improved parallel load balancing and faster sequential cube construction. These optimizations were key in allowing us to build a prototype that is able to produce data cube output at a rate of over one terabyte per hour.
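
To make the notion of a data cube concrete, the following minimal sketch computes every group-by over a tiny fact table, one aggregate view per subset of the dimensions (2^d views for d dimensions). It is a naive illustration only and is not the cgmCUBE construction method; the function name `naive_data_cube`, the sample rows, and the column names `store`, `product`, and `sales` are all assumptions made up for this example.

```python
from collections import defaultdict
from itertools import combinations


def naive_data_cube(rows, dims, measure):
    """Return SUM(measure) for every subset of dims (all 2^d group-bys).

    Hypothetical helper for illustration; it rescans the input once per
    group-by, which the optimized algorithms in the literature avoid.
    """
    cube = {}
    for k in range(len(dims) + 1):
        for group_by in combinations(dims, k):
            totals = defaultdict(int)
            for row in rows:
                # The key is the row's values on the chosen dimensions.
                key = tuple(row[d] for d in group_by)
                totals[key] += row[measure]
            cube[group_by] = dict(totals)
    return cube


# Made-up sample fact table with two dimensions and one measure.
rows = [
    {"store": "A", "product": "x", "sales": 10},
    {"store": "A", "product": "y", "sales": 5},
    {"store": "B", "product": "x", "sales": 7},
]
cube = naive_data_cube(rows, ("store", "product"), "sales")
```

With two dimensions the result holds four views: the full `(store, product)` group-by, the two single-dimension group-bys, and the empty group-by containing the grand total. Because the number of views doubles with each added dimension, sequential construction becomes costly on large inputs, which is the motivation the abstract gives for multi-processor solutions.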