Iceberg data cubing methods rely on optimization techniques that compute only the cuboid cells above a certain minimum support threshold. Even with this approach the curse of dimensionality remains: the number of cuboids to compute is large, and the output is correspondingly huge. More recently, some efforts have focused on computing only closed cuboids. Nevertheless, for some dense databases, which are the ones considered in this paper, even the set of all closed cuboids is too large. An alternative is to compute only the maximal cuboids. However, a purely maximal approach loses information: the complete set of cuboid cells can be generated from the maximal ones, but without their respective aggregation values. To control this loss of information we add an interestingness measure, which we call the correlated value of a cuboid cell. In this paper, we propose a new notion for reducing cuboid aggregation by computing only the maximal-correlated cuboid cells, and present the M3C-Cubing algorithm, which produces those cells. Our evaluation shows that the method is a promising candidate for scalable data cubing, reducing the number of cuboids by at least an order of magnitude compared with the number of closed ones.
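To make the notions of iceberg cells and maximal cells concrete, the following is a minimal brute-force sketch. It is not the M3C-Cubing algorithm, and it leaves out the correlated value, which the abstract does not define. The function names, the toy table, the use of a row count as the aggregate, and the choice of "count >= min_sup" as the threshold test are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations

ALL = "*"  # placeholder for an aggregated ("any value") dimension


def iceberg_cells(rows, min_sup):
    """Count every cell by brute force and keep those with count >= min_sup.

    Hypothetical helper: a real iceberg method prunes instead of enumerating.
    """
    n_dims = len(rows[0])
    counts = Counter()
    for row in rows:
        # Each row contributes to 2**n_dims cells, one per subset of kept dimensions.
        for k in range(n_dims + 1):
            for kept in combinations(range(n_dims), k):
                cell = tuple(row[d] if d in kept else ALL for d in range(n_dims))
                counts[cell] += 1
    return {cell: n for cell, n in counts.items() if n >= min_sup}


def generalizes(a, b):
    """True if cell a is a strict generalization of cell b."""
    return a != b and all(x == ALL or x == y for x, y in zip(a, b))


def maximal_cells(cells):
    """Keep the cells that have no more specific cell in the same collection."""
    return {c for c in cells if not any(generalizes(c, other) for other in cells)}


# Toy three-dimensional table (assumed data, not taken from the paper).
rows = [
    ("a1", "b1", "c1"),
    ("a1", "b1", "c2"),
    ("a1", "b2", "c1"),
    ("a2", "b1", "c1"),
]
cube = iceberg_cells(rows, min_sup=2)
maximal = maximal_cells(cube)
```

On this toy table, seven cells meet the threshold and three of them are maximal. Every iceberg cell can be recovered by generalizing a maximal cell, but the counts of the non-maximal cells cannot, which is the loss of information the abstract refers to.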