The normalization of a data cube is the ordering of the attribute values. For large multidimensional arrays where dense and sparse chunks are stored differently, proper normalization can lead to improved storage efficiency. We show that it is NP-hard to compute an optimal normalization even for 1x3 chunks, although we find an exact algorithm for 1x2 chunks. When dimensions are nearly statistically independent, we show that dimension-wise attribute frequency sorting is an optimal normalization and takes time O(d n log(n)) for data cubes of size n^d. When dimensions are not independent, we propose and evaluate several heuristics. The hybrid OLAP (HOLAP) storage mechanism is already 19-30% more efficient than ROLAP, but normalization can improve it further by 9-13%, for a total gain of 29-44% over ROLAP.
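As an illustration of dimension-wise attribute frequency sorting, here is a minimal Python sketch. It is not the authors' implementation. It assumes the cube is given as a set of coordinate tuples for its allocated cells, and that "frequency" means the number of allocated cells sharing an attribute value. The name `frequency_sort`, the parameters `cells` and `shape`, and the tie-breaking rule (original order is kept) are all hypothetical choices made for this example.

```python
from collections import Counter


def frequency_sort(cells, shape):
    """Relabel attribute values on every dimension by decreasing frequency.

    cells: iterable of coordinate tuples, one per allocated cell (assumed input format).
    shape: tuple giving the number of attribute values on each dimension.
    Returns the relabeled set of cells and one old->new mapping per dimension.
    """
    cells = list(cells)
    relabel = []
    for axis, size in enumerate(shape):
        # Count the allocated cells in each slice along this dimension.
        counts = Counter(cell[axis] for cell in cells)
        # Most frequent values first; sorted() is stable, so ties keep their order.
        order = sorted(range(size), key=lambda value: -counts[value])
        relabel.append({old: new for new, old in enumerate(order)})
    normalized = {
        tuple(relabel[axis][value] for axis, value in enumerate(cell))
        for cell in cells
    }
    return normalized, relabel


# A 3x3 cube with five allocated cells.
cells = {(0, 1), (1, 0), (1, 1), (1, 2), (2, 2)}
normalized, relabel = frequency_sort(cells, (3, 3))
```

In this small example the densest row and the densest columns move to index 0, so the allocated cells cluster toward one corner of the array. That clustering is what lets a chunked store keep fewer, denser chunks.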