A view selection algorithm with performance guarantee
Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology
Sorting improves word-aligned bitmap indexes
Data & Knowledge Engineering
Reordering columns for smaller indexes
Information Sciences: an International Journal
On power-law distributed balls in bins and its applications to view size estimation
ISAAC'11 Proceedings of the 22nd international conference on Algorithms and Computation
Reordering rows for better compression: Beyond the lexicographic order
ACM Transactions on Database Systems (TODS)
Proceedings of the 16th International Conference on Extending Database Technology
A data warehouse cannot materialize all possible views, hence we must estimate the size of views quickly, accurately, and reliably to determine the best candidates for materialization. Many available techniques for view-size estimation make particular statistical assumptions, and their error can be large. By comparison, unassuming probabilistic techniques are slower, but they estimate very large view sizes accurately and reliably using little memory. We compare five unassuming hashing-based view-size estimation techniques, including Stochastic Probabilistic Counting and LogLog Probabilistic Counting. Our experiments show that only Generalized Counting, Gibbons-Tirthapura, and Adaptive Counting provide universally tight estimates irrespective of the size of the view; of those, only Adaptive Counting remains constantly fast as we increase the memory budget.
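To make the idea of hashing-based view-size estimation concrete, the following is a minimal sketch in the style of the Gibbons-Tirthapura estimator named above. It is an illustration under stated assumptions, not the implementation evaluated in the paper: the function names (`hash64`, `estimate_view_size`), the `memory` parameter, and the use of BLAKE2b as a stand-in for a suitably independent hash family are all choices made here. The sketch keeps only hashed values whose lowest `level` bits are zero, raises `level` whenever the sample outgrows the memory budget, and scales the final sample size by `2**level`.

```python
import hashlib


def hash64(key):
    """Map a tuple of attribute values to a 64-bit integer.

    Assumption: BLAKE2b stands in for the hash family a real
    estimator would use; it is chosen only for determinism.
    """
    digest = hashlib.blake2b(repr(key).encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def estimate_view_size(rows, dims, memory=256):
    """Estimate the number of distinct projections of `rows` onto `dims`.

    rows   -- iterable of tuples (the fact table)
    dims   -- column positions that define the view (group-by attributes)
    memory -- maximum number of hashed values kept at any time
    """
    level = 0       # a hash is kept only if its lowest `level` bits are zero
    sample = set()  # hashed values currently retained
    for row in rows:
        h = hash64(tuple(row[d] for d in dims))
        if h & ((1 << level) - 1) == 0:
            sample.add(h)
            # When the budget is exceeded, keep roughly half of the sample.
            while len(sample) > memory:
                level += 1
                mask = (1 << level) - 1
                sample = {x for x in sample if x & mask == 0}
    # Each retained value represents about 2**level distinct values.
    return len(sample) * (1 << level)


# Hypothetical fact table with three dimensions.
facts = [(i % 10, i % 7, i) for i in range(100_000)]
small_view = estimate_view_size(facts, dims=(0, 1))  # few distinct groups
large_view = estimate_view_size(facts, dims=(2,))    # one group per row
```

While a view has no more distinct groups than `memory`, `level` stays at zero and the count is exact up to hash collisions; for larger views the result is an estimate whose accuracy improves as `memory` grows.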