Approximate query processing has emerged as a way to cope with the huge data volumes and complex queries typical of data warehouse environments. In this paper, we present a novel method that provides approximate answers to OLAP queries. Our method builds a compressed (approximate) data cube using a clustering technique and answers queries directly from this compressed cube, which improves query performance. We also provide an algorithm for answering OLAP queries and confidence intervals for the query results. An extensive experimental study with the OLAP Council benchmark shows the effectiveness and scalability of our cluster-based approach compared to sampling.
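To make the general idea concrete, the following is a minimal sketch of a cluster-based compressed cube. It is an illustration under stated assumptions, not the algorithm of the paper: the abstract does not specify the clustering technique, the per-cluster summary, or the form of the confidence interval. Here we assume plain k-means over the dimension coordinates, a summary holding a bounding box, count, sum, and sum of squares per cluster, a uniformity assumption inside each bounding box, and a heuristic normal-approximation interval. All names (`ClusterSummary`, `build_compressed_cube`, `approx_sum`) are hypothetical.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class ClusterSummary:
    lo: tuple        # per-dimension minimum of member coordinates
    hi: tuple        # per-dimension maximum of member coordinates
    count: int       # number of tuples in the cluster
    total: float     # sum of the measure over the cluster
    sq_total: float  # sum of squared measures (for the variance)


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means (an assumed stand-in); returns one label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def build_compressed_cube(points, measures, k):
    """Replace the raw tuples by one small summary per non-empty cluster."""
    labels = kmeans(points, k)
    cube = []
    for c in range(k):
        idx = [i for i, l in enumerate(labels) if l == c]
        if not idx:
            continue
        coords = [points[i] for i in idx]
        vals = [measures[i] for i in idx]
        cube.append(ClusterSummary(
            lo=tuple(min(x) for x in zip(*coords)),
            hi=tuple(max(x) for x in zip(*coords)),
            count=len(idx),
            total=sum(vals),
            sq_total=sum(v * v for v in vals),
        ))
    return cube


def overlap_fraction(s, q_lo, q_hi):
    """Share of the cluster's bounding box inside the query box,
    assuming tuples are spread uniformly over the bounding box."""
    frac = 1.0
    for lo, hi, ql, qh in zip(s.lo, s.hi, q_lo, q_hi):
        if hi == lo:  # degenerate extent: either fully in or fully out
            frac *= 1.0 if ql <= lo <= qh else 0.0
        else:
            frac *= max(0.0, min(hi, qh) - max(lo, ql)) / (hi - lo)
    return frac


def approx_sum(cube, q_lo, q_hi, z=1.96):
    """Approximate range-SUM answered from the summaries alone.
    The interval is a heuristic: it treats the estimated f * count
    selected tuples of each cluster as independent draws with the
    cluster's measure variance. It is not the paper's interval."""
    est, var = 0.0, 0.0
    for s in cube:
        f = overlap_fraction(s, q_lo, q_hi)
        mean = s.total / s.count
        variance = max(0.0, s.sq_total / s.count - mean * mean)
        est += f * s.total
        var += f * s.count * variance
    half = z * math.sqrt(var)
    return est, (est - half, est + half)
```

A query is answered by calling, for example, `approx_sum(cube, (10, 10), (40, 60))`, which touches only the `k` summaries rather than the base tuples. This is where the speed-up of such a scheme would come from, at the cost of the error introduced by the uniformity assumption.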