On-line analytical processing (OLAP) is an important technique for analyzing data in decision support systems. Most analytical queries require aggregating the data of interest, and pre-aggregation is one of the most important techniques for reducing query response time. However, precomputing every aggregate takes a large amount of time and space, so it is important to decide which aggregates to precompute and how much space they require. By estimating the storage space required for each aggregate view, we can allocate space for aggregates efficiently and decide which aggregates to precompute. We investigate four existing strategies for this problem: two based on mathematical approximations, one based on sampling, and one hybrid approach that combines mathematical approximation with sampling. We propose a new hybrid strategy that also combines mathematical approximation with sampling and is easy to compute. We evaluate the accuracy of these algorithms in estimating the storage explosion due to aggregation for different data distributions and data densities. The results indicate that our proposed strategy approximates the explosion more accurately than the other strategies.
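To make the idea of a mathematical approximation concrete, the sketch below estimates the number of rows in an aggregate view from the fact-table size and the cardinalities of the grouping dimensions. The abstract does not say which formulas the evaluated strategies use; this example assumes Cardenas' formula, a well-known analytical estimate that holds when rows are spread uniformly and independently over the possible dimension combinations. The function name `cardenas_estimate` and its parameters are hypothetical and chosen for illustration only.

```python
from math import expm1, log1p, prod


def cardenas_estimate(num_rows, cardinalities):
    """Estimate the number of distinct groups in an aggregate view.

    Assumption (not taken from the abstract): Cardenas' formula,
    cells * (1 - (1 - 1/cells) ** num_rows), where `cells` is the
    product of the grouping dimensions' cardinalities. It presumes
    rows fall uniformly and independently into the cells.
    """
    cells = prod(cardinalities)
    if num_rows <= 0 or cells <= 0:
        return 0.0
    if cells == 1:
        # A single possible group is occupied as soon as one row exists.
        return 1.0
    # Numerically stable form of cells * (1 - (1 - 1/cells) ** num_rows).
    return cells * -expm1(num_rows * log1p(-1.0 / cells))


if __name__ == "__main__":
    # Hypothetical fact table: 1,000 rows grouped by three dimensions
    # with 10 distinct values each (1,000 possible cells).
    print(round(cardenas_estimate(1000, [10, 10, 10]), 1))
```

The estimate never exceeds either the number of fact rows or the number of possible cells. On skewed data the uniformity assumption typically overestimates the view size, which is one reason to complement an analytical formula with sampling.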