ACM Transactions on Knowledge Discovery from Data (TKDD) - Special Issue on the Best of SIGKDD 2011
Summarizing query results is an important problem for many OLAP applications. The Minimum Description Length (MDL) principle has been applied in various studies to produce summaries. In this paper, we consider a new approach to applying the MDL principle. We study the problem of finding summaries of the form S Θ H for k-d cubes with tree hierarchies. The S part generalizes the query results, while the H part describes all the exceptions to the generalizations. The optimization problem is to minimize the combined cardinalities of S and H. We first characterize the problem by showing that the 1-d problem can be solved in time linear in the size of the hierarchy, but that the 2-d problem is NP-hard. We then develop three heuristics, based on a greedy approach, a dynamic programming approach, and a quadratic programming approach. We conduct a comprehensive experimental evaluation. The dynamic programming algorithm and the greedy algorithm each suit different circumstances, and both produce summaries that are significantly shorter than those generated by state-of-the-art alternatives.
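
To make the 1-d objective concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a simplified model in which S is a set of hierarchy nodes, H is a set of individual leaves, and the leaves under S minus the leaves in H must equal the query result exactly. The function name `summarize_1d`, the `children` dictionary encoding of the hierarchy, and the sample hierarchy are all hypothetical. Under these assumptions, one bottom-up pass and one top-down pass over the tree minimize |S| + |H| in time linear in the size of the hierarchy.

```python
def summarize_1d(children, root, result):
    """Return (S, H) minimizing |S| + |H| under the simplified model above.

    children: dict mapping each internal node to its list of child nodes
              (hypothetical encoding; leaves have no entry).
    result:   set of leaves that belong to the query result.
    """
    # Order the nodes so that every parent precedes its children.
    order, stack = [], [root]
    while stack:
        v = stack.pop()
        order.append(v)
        stack.extend(children.get(v, []))

    cost, misses, take = {}, {}, {}
    # Bottom-up pass: children are processed before their parents.
    for v in reversed(order):
        kids = children.get(v, [])
        if not kids:
            hit = v in result
            misses[v] = 0 if hit else 1   # non-result leaves below v
            cost[v] = 1 if hit else 0     # a result leaf is listed in S
            take[v] = hit
        else:
            misses[v] = sum(misses[c] for c in kids)
            split = sum(cost[c] for c in kids)  # describe each child separately
            whole = 1 + misses[v]               # put v in S, list its holes in H
            take[v] = whole < split
            cost[v] = min(whole, split)

    # Top-down pass: collect the chosen generalizations and exceptions.
    S, H = set(), set()
    stack = [(root, False)]
    while stack:
        v, covered = stack.pop()
        kids = children.get(v, [])
        if covered:
            if not kids and v not in result:
                H.add(v)                  # exception to a generalization
        elif take[v]:
            S.add(v)
            covered = True
        stack.extend((c, covered) for c in kids)
    return S, H


# Hypothetical region hierarchy used only for illustration.
hierarchy = {
    "All": ["West", "East"],
    "West": ["CA", "OR", "WA"],
    "East": ["NY", "MA", "CT", "NJ"],
}
S, H = summarize_1d(hierarchy, "All", {"CA", "OR", "WA", "NY", "MA", "CT"})
print(S, H)
```

In this example the six result leaves are described by one generalization and one exception, so the summary has combined cardinality 2 instead of 6. The sketch does not extend to the 2-d case, which the abstract states is NP-hard.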