Exact and Approximate Sizes of Convex Datacubes

Authors:
Sébastien Nedjar
Affiliations:
Laboratoire d'Informatique Fondamentale de Marseille (LIF), Aix-Marseille Université - CNRS, Marseille Cedex 9, France 13288
Venue:
DaWaK '09 Proceedings of the 11th International Conference on Data Warehousing and Knowledge Discovery
Year:
2009

Abstract

Data cubes are commonly pre-computed so that OLAP queries can be answered efficiently. The notion of data cube has been explored in several variants: iceberg cubes, range cubes, differential cubes and emerging cubes. In previous work, we introduced the convex cube, which generalizes all of these variants. More precisely, the convex cube captures all the tuples satisfying a combination of monotone and/or antimonotone constraints. This paper studies the size of the convex cube. Knowing the size of such a cube before computing it has several advantages. First, the storage space it needs can be reserved in advance, which improves data warehouse administration. The main benefit, however, is that the constraints can be chosen so that the result is of workable size. To help calibrate the constraints, we propose a sound characterization of the exact size of the convex cube, based on the inclusion-exclusion principle, as well as an upper bound that can be computed very quickly. Moreover, we adapt the nearly optimal HyperLogLog algorithm to provide a very good approximation of the exact size of convex cubes. Our analytical results are confirmed by experiments: the approximate size of convex cubes is very close to their exact size and can be computed almost immediately.
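
To give a feel for the approximate-counting idea mentioned in the abstract, here is a minimal sketch of a plain HyperLogLog estimator applied to the tuples of a full data cube. It is not the paper's adaptation to convex cubes, which the abstract does not detail. The names `HyperLogLog`, `generalizations`, `estimate_cube_size` and the `ALL` marker are hypothetical, and the use of SHA-1 as the hash function and of 2^12 registers are assumptions made for illustration.

```python
import hashlib
import math
from itertools import product

ALL = "ALL"  # hypothetical marker for an aggregated (rolled-up) dimension


class HyperLogLog:
    """Plain HyperLogLog with the usual small-range correction."""

    def __init__(self, p=12):
        self.p = p                      # number of index bits
        self.m = 1 << p                 # number of registers
        self.registers = [0] * self.m
        # Bias-correction constant; this closed form is valid for m >= 128.
        self.alpha = 0.7213 / (1 + 1.079 / self.m)

    def add(self, item):
        # Assumption: the first 64 bits of SHA-1 serve as the hash value.
        digest = hashlib.sha1(repr(item).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        idx = h >> (64 - self.p)                    # top p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)       # remaining 64 - p bits
        # Rank = 1-based position of the leftmost 1 bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def estimate(self):
        z = sum(2.0 ** -r for r in self.registers)
        e = self.alpha * self.m * self.m / z
        zeros = self.registers.count(0)
        if e <= 2.5 * self.m and zeros:
            # Small-range correction: fall back to linear counting.
            e = self.m * math.log(self.m / zeros)
        return e


def generalizations(row):
    """Yield every cube tuple that generalizes `row` (each value or ALL)."""
    return product(*[(value, ALL) for value in row])


def estimate_cube_size(rows, p=12):
    """Approximate the number of distinct tuples in the full data cube."""
    sketch = HyperLogLog(p)
    for row in rows:
        for t in generalizations(row):
            sketch.add(t)
    return sketch.estimate()
```

Calling `estimate_cube_size(rows)` on a list of equal-length tuples returns a floating-point estimate using a fixed number of registers, whereas the exact count requires keeping every distinct cube tuple in memory. Restricting the count to tuples that satisfy monotone or antimonotone constraints, as the convex cube requires, is outside the scope of this sketch.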