Fast Approximate Answers to Aggregate Queries on a Data Cube

Authors:
Viswanath Poosala;Venkatesh Ganti
Venue:
SSDBM '99 Proceedings of the 11th International Conference on Scientific and Statistical Database Management
Year:
1999

Citing 0
Cited 36

Approximating multi-dimensional aggregate range queries over real attributes

SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
STHoles: a multidimensional workload-aware histogram

SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
A robust, optimization-based approach for approximate answering of aggregate queries

SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
Data-streams and histograms

STOC '01 Proceedings of the thirty-third annual ACM symposium on Theory of computing
Models and issues in data stream systems

Proceedings of the twenty-first ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
How to evaluate multiple range-sum queries progressively

Proceedings of the twenty-first ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Continuous queries over data streams

ACM SIGMOD Record
ProPolyne: A Fast Wavelet-Based Algorithm for Progressive Evaluation of Polynomial Range-Sum Queries

EDBT '02 Proceedings of the 8th International Conference on Extending Database Technology: Advances in Database Technology
Aqua: A Fast Decision Support Systems Using Approximate Query Answers

VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
Histogram-Based Approximation of Set-Valued Query-Answers

VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
Approximate Query Processing Using Wavelets

VLDB '00 Proceedings of the 26th International Conference on Very Large Data Bases
ICICLES: Self-Tuning Samples for Approximate Query Answering

VLDB '00 Proceedings of the 26th International Conference on Very Large Data Bases
Distinct Sampling for Highly-Accurate Answers to Distinct Values Queries and Event Reports

Proceedings of the 27th International Conference on Very Large Data Bases
Value Range Queries on Earth Science Data via Histogram Clustering

TSDM '00 Proceedings of the First International Workshop on Temporal, Spatial, and Spatio-Temporal Data Mining-Revised Papers
Vmhist: Efficient Multidimensional Histograms with Improved Accuracy

DaWaK 2000 Proceedings of the Second International Conference on Data Warehousing and Knowledge Discovery
Approximate query processing using wavelets

The VLDB Journal — The International Journal on Very Large Data Bases
Managing and analyzing massive data sets with data cubes

Handbook of massive data sets
pCube: Update-Efficient Online Aggregation with Progressive Feedback and Error Bounds

SSDBM '00 Proceedings of the 12th International Conference on Scientific and Statistical Database Management
Extended wavelets for multiple measures

Proceedings of the 2003 ACM SIGMOD international conference on Management of data
Selectivity estimators for multidimensional range queries over real attributes

The VLDB Journal — The International Journal on Very Large Data Bases
Improving range-sum query evaluation on data cubes via polynomial approximation

Data & Knowledge Engineering
Pre-aggregation with probability distributions

DOLAP '06 Proceedings of the 9th ACM international workshop on Data warehousing and OLAP
Approximate range-sum query answering on data cubes with probabilistic guarantees

Journal of Intelligent Information Systems
Optimized stratified sampling for approximate query processing

ACM Transactions on Database Systems (TODS)
ROLAP implementations of the data cube

ACM Computing Surveys (CSUR)
XWAVE: optimal and approximate extended wavelets

VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
DAWN: an efficient framework of DCT for data with error estimation

The VLDB Journal — The International Journal on Very Large Data Bases
Plot Query Processing with Wavelets

SSDBM '08 Proceedings of the 20th international conference on Scientific and Statistical Database Management
Multiple-Objective Compression of Data Cubes in Cooperative OLAP Environments

ADBIS '08 Proceedings of the 12th East European conference on Advances in Databases and Information Systems
Enabling OLAP in mobile environments via intelligent data cube compression techniques

Journal of Intelligent Information Systems
A top-down approach for compressing data cubes under the simultaneous evaluation of multiple hierarchical range queries

Journal of Intelligent Information Systems
Top-down compression of data cubes in the presence of simultaneous multiple hierarchical range queries

ISMIS'08 Proceedings of the 17th international conference on Foundations of intelligent systems
Adaptive dimensionality reduction for recent-biased time series analysis

Proceedings of the 1st Amrita ACM-W Celebration on Women in Computing in India
A quad-tree based multiresolution approach for two-dimensional summary data

Information Systems
Synopses for Massive Data: Samples, Histograms, Wavelets, Sketches

Foundations and Trends in Databases
Exploiting data access for dynamic fragmentation in data warehouse

International Journal of Intelligent Information and Database Systems

Abstract

Modern decision support systems require very quick (interactive) responses from the DBMS, but pose complex queries on large volumes of data. In this paper, we present a novel solution to this problem: we precompute concise histogram statistics on the data to answer the queries quickly, but approximately. Our hypothesis is that many decision support applications can tolerate small errors in query results in return for large reductions in response times. In particular, we propose the use of multiple histograms to approximate the data cube and answer aggregate queries approximately using this summarized data. We enhance histograms to estimate the quality of the approximate answers. We primarily explore the interaction among various histograms on the data cube in order to minimize the space needed when an upper bound on the errors is given. Our main contribution in this paper is an efficient technique for selecting a provably near-optimal set of histograms on the data cube. Extensive experiments show that our technique results in very accurate and concise statistics.
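
As a rough illustration of the general idea in the abstract (answering an aggregate query from a precomputed histogram instead of the raw data), the following minimal sketch summarizes a one-dimensional array of cell values with equal-width buckets and estimates a range SUM from them. It is not the paper's technique: the equal-width partitioning, the uniform-spread assumption inside each bucket, and the names `Bucket`, `build_equiwidth`, and `approx_range_sum` are all assumptions made here for illustration, and the sketch covers neither histogram selection nor error estimation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Bucket:
    lo: int       # first cell index covered (inclusive)
    hi: int       # last cell index covered (inclusive)
    total: float  # sum of the measure over cells lo..hi


def build_equiwidth(cells: List[float], num_buckets: int) -> List[Bucket]:
    """Summarize a 1-D array of cell values with equal-width buckets (hypothetical helper)."""
    n = len(cells)
    width = -(-n // num_buckets)  # ceiling division
    return [
        Bucket(start, min(start + width, n) - 1, sum(cells[start:start + width]))
        for start in range(0, n, width)
    ]


def approx_range_sum(buckets: List[Bucket], lo: int, hi: int) -> float:
    """Estimate SUM over cells lo..hi, assuming values are spread uniformly within each bucket."""
    estimate = 0.0
    for b in buckets:
        # Number of cells of this bucket that fall inside the query range.
        overlap = min(hi, b.hi) - max(lo, b.lo) + 1
        if overlap > 0:
            estimate += b.total * overlap / (b.hi - b.lo + 1)
    return estimate


# Made-up example data: eight cells summarized by four buckets.
cells = [4, 6, 5, 5, 20, 0, 10, 10]
hist = build_equiwidth(cells, num_buckets=4)
exact = sum(cells[1:6])                # exact SUM over cells 1..5
approx = approx_range_sum(hist, 1, 5)  # estimate from the summary only
print(exact, approx)
```

On this made-up data the estimate differs slightly from the exact sum because the query cuts through the first bucket, which is the kind of small error the abstract proposes trading for a faster response.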