Dimension attributes in data warehouses are typically hierarchical (e.g., geographic locations in sales data, URLs in Web traffic logs). OLAP tools are used to summarize the measure attributes (e.g., total sales) along a dimension hierarchy, and to characterize changes (e.g., trends and anomalies) in a hierarchical summary over time. When the number of changes identified is large (e.g., total sales in many stores differed from their expected values), a parsimonious explanation of the most significant changes is desirable. In this paper, we propose a natural model of parsimonious explanation, as a composition of node weights along the root-to-leaf paths in a dimension hierarchy, which permits changes to be aggregated with maximal generalization along the dimension hierarchy. We formalize this model of explaining changes in hierarchical summaries and investigate the problem of identifying optimally parsimonious explanations on arbitrary rooted one-dimensional tree hierarchies. We show that such explanations can be computed efficiently, in time essentially proportional to the number of leaves and the depth of the hierarchy. Further, our method can produce parsimonious explanations from the output of any statistical model that provides predictions and confidence intervals, making it widely applicable. Our experiments use real data sets to demonstrate the utility and robustness of our proposed model for explaining significant changes, as well as its superior parsimony compared to alternatives.
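The explanation model described above can be illustrated with a small sketch. It is not the paper's algorithm: it only evaluates a given assignment of node weights, and it assumes the "composition" along a root-to-leaf path is a plain sum (the abstract does not say which operator is used). The toy hierarchy, the function names, and the tolerance parameter are all hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical toy hierarchy, stored as child -> parent (the root maps to None).
PARENT: Dict[str, Optional[str]] = {
    "All": None,
    "East": "All",
    "West": "All",
    "E1": "East",
    "E2": "East",
    "W1": "West",
    "W2": "West",
}


def leaves(parent: Dict[str, Optional[str]]) -> List[str]:
    """Return the nodes that are nobody's parent, in sorted order."""
    internal = {p for p in parent.values() if p is not None}
    return sorted(n for n in parent if n not in internal)


def explained_change(leaf: str, weights: Dict[str, float],
                     parent: Dict[str, Optional[str]]) -> float:
    """Compose the weights on the path from the leaf up to the root.

    Assumption: composition is addition; nodes without a weight count as 0.
    """
    total = 0.0
    node: Optional[str] = leaf
    while node is not None:
        total += weights.get(node, 0.0)
        node = parent[node]
    return total


def explanation_size(weights: Dict[str, float]) -> int:
    """Parsimony measure used here: the number of nonzero node weights."""
    return sum(1 for w in weights.values() if w != 0)


def explains(weights: Dict[str, float], observed: Dict[str, float],
             parent: Dict[str, Optional[str]], tol: float = 1e-9) -> bool:
    """Check that every leaf's composed weight matches its observed change."""
    return all(
        abs(explained_change(leaf, weights, parent) - observed[leaf]) <= tol
        for leaf in leaves(parent)
    )


# Observed change per leaf (made-up numbers for illustration only).
observed = {"E1": 10.0, "E2": 10.0, "W1": 0.0, "W2": 3.0}

# Two candidate explanations of the same changes.
per_leaf = {"E1": 10.0, "E2": 10.0, "W2": 3.0}   # one weight per changed leaf
generalized = {"East": 10.0, "W2": 3.0}          # shared change lifted to "East"

print(explains(per_leaf, observed, PARENT), explanation_size(per_leaf))
print(explains(generalized, observed, PARENT), explanation_size(generalized))
```

Both candidates reproduce the observed leaf changes, but the second does so with fewer nonzero weights because the change common to both East stores is attached to their shared ancestor. Finding the smallest such assignment is the optimization problem the paper addresses; this sketch only scores candidates.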