Dwarf is a highly compressed structure for computing, storing, and querying data cubes. Dwarf identifies prefix and suffix structural redundancies and factors them out by coalescing their storage. Prefix redundancy is high in dense areas of cubes, but suffix redundancy is significantly higher in sparse areas. Putting the two together fuses the exponential sizes of high-dimensional full cubes into a dramatically condensed data structure. Eliminating suffix redundancy yields an equally dramatic reduction in the computation of the cube, because recomputation of the redundant suffixes is avoided. This effect is multiplied in the presence of correlation among attributes in the cube. A petabyte 25-dimensional cube was shrunk this way to a 2.3 GB Dwarf cube in less than 20 minutes, a 1:400,000 storage reduction ratio. Still, Dwarf provides 100% precision on cube queries and is a self-sufficient structure that requires no access to the fact table. What makes Dwarf practical is the automatic discovery, in a single pass over the fact table, of the prefix and suffix redundancies without user involvement or knowledge of the value distributions.

This paper describes the Dwarf structure and the Dwarf cube construction algorithm. Further optimizations are then introduced for improving clustering and query performance. Experiments with the current implementation include detailed comparisons, on real and synthetic datasets, against previously published techniques. The comparisons show that Dwarfs outperform these techniques by far on all counts: storage space, creation time, query response time, and cube updates.
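
The two kinds of redundancy can be made concrete with a small sketch. The code below is an illustration only, not the Dwarf construction algorithm: it materializes a full SUM cube for a made-up four-row fact table as a tree in which common prefixes are stored once, and then counts how many sub-trees remain once identical ones are stored a single time (suffix coalescing). The fact table, the function names, and the materialize-then-deduplicate approach are all assumptions made for this example; the abstract states that Dwarf discovers both redundancies in a single pass over the fact table, which this sketch does not attempt.

```python
from itertools import product

ALL = "*"  # stands for the aggregated (ALL) value of a dimension


def build_cube_tree(rows):
    """Full SUM cube as nested dicts, one level per dimension.

    Each row is added under every combination of concrete value / ALL,
    so group-bys that share a prefix share the same path in the tree.
    """
    root = {}
    for *dims, measure in rows:
        for mask in product((False, True), repeat=len(dims)):
            keys = [ALL if use_all else d for d, use_all in zip(dims, mask)]
            node = root
            for key in keys[:-1]:
                node = node.setdefault(key, {})
            node[keys[-1]] = node.get(keys[-1], 0) + measure
    return root


def count_nodes(node):
    """Number of dict nodes in the prefix-shared tree."""
    if not isinstance(node, dict):
        return 0
    return 1 + sum(count_nodes(child) for child in node.values())


def signature(node, distinct):
    """Canonical form of a sub-tree; records each distinct sub-tree once."""
    if not isinstance(node, dict):
        return node
    sig = tuple(sorted((k, signature(v, distinct)) for k, v in node.items()))
    distinct.add(sig)
    return sig


# Hypothetical fact table: (store, customer, product, price).
FACTS = [
    ("S1", "C2", "P2", 70),
    ("S1", "C3", "P1", 40),
    ("S2", "C1", "P1", 90),
    ("S2", "C1", "P2", 50),
]

cube = build_cube_tree(FACTS)
distinct_subtrees = set()
signature(cube, distinct_subtrees)

print("nodes with prefix sharing only:", count_nodes(cube))
print("nodes after coalescing identical sub-trees:", len(distinct_subtrees))
print("SUM over store S2, any customer, product P1:", cube["S2"][ALL]["P1"])
```

In this toy table, store S2 has a single customer, so the sub-tree under (S2, ALL) is identical to the one under (S2, C1) and needs to be stored only once. Sparse data produces many such coincidences, which is consistent with the abstract's remark that suffix redundancy dominates in sparse areas. Every group-by remains answerable from the tree alone, without returning to the fact table.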