DSQoS: distributed architecture providing QoS in summary warehouses

Authors:
João Pedro Costa;Pedro Furtado
Affiliations:
Instituto Superior de Engenharia de Coimbra, Coimbra, Portugal;Universidade de Coimbra, Coimbra, Portugal
Venue:
DOLAP '03 Proceedings of the 6th ACM international workshop on Data warehousing and OLAP
Year:
2003

Abstract

Data warehouses (DW) that store enormous quantities of data pose a major challenge for performance and scalability, as users expect instant answers to their queries. Traditional solutions rely on very expensive architectures and structures for speedup and scale-up. The Summary Warehouse (SW) is an inexpensive solution that has the potential to deliver very fast approximate answers to aggregate queries using only general-purpose sampling summaries.

Although summaries are expected to be extremely fast, some analyses require larger summaries to estimate individual group results, compromising the speedup advantage. This is the accuracy/speedup (A/S) tradeoff.

In this paper we propose the "Distributed Set-of-Summaries for Quality of Service" (DSQoS) approach, which addresses the A/S tradeoff by optimizing the accuracy and response time for each query pattern in order to guarantee a desired Quality of Service (QoS). This QoS is defined in terms of response time and accuracy bounds. The strategy determines the summary size required to guarantee the accuracy targets and then dynamically selects a set of summaries, distributed across various nodes, that can satisfy the QoS constraints (time and accuracy). The strategy is highly flexible, since each node can hold summaries of different sizes, depending on the node's characteristics, and nodes can be dynamically added to and removed from the system.

We discuss the design of the approach and the strategies used to process queries. In the experimental section we show how the approach is able to deliver almost instant and accurate answers without employing expensive architectures, which would be impossible using other strategies.
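
The two-step strategy described in the abstract (size the summary for the accuracy target, then pick summaries across nodes that meet the time bound) can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract gives no formulas, so the sketch assumes a standard large-sample (CLT) bound for the required sample size and a simple greedy selection. The names `NodeSummary`, `required_sample_size`, `select_summaries`, and the per-node `rows_per_sec` throughput figure are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class NodeSummary:
    node: str            # hypothetical node identifier
    rows: int            # sampled rows stored in this node's summary
    rows_per_sec: float  # assumed scan throughput of the node


def required_sample_size(cv: float, rel_error: float, z: float = 1.96) -> int:
    """Rows needed so that the CLT confidence-interval half-width of an
    estimated mean stays within rel_error of the estimate.
    cv is the coefficient of variation (stddev / mean) of the measure;
    z = 1.96 corresponds to roughly 95% confidence."""
    return math.ceil((z * cv / rel_error) ** 2)


def select_summaries(nodes, needed_rows, max_seconds):
    """Greedy selection, fastest nodes first. Nodes are assumed to scan in
    parallel, so each may contribute at most the rows it can scan within
    max_seconds. Returns [(node, rows_used), ...], or None when the
    time and accuracy bounds cannot both be met."""
    chosen = []
    remaining = needed_rows
    for n in sorted(nodes, key=lambda n: n.rows_per_sec, reverse=True):
        if remaining <= 0:
            break
        usable = min(n.rows, int(n.rows_per_sec * max_seconds), remaining)
        if usable > 0:
            chosen.append((n.node, usable))
            remaining -= usable
    return chosen if remaining <= 0 else None


# Illustrative figures only, not taken from the paper.
nodes = [
    NodeSummary("A", rows=4000, rows_per_sec=10000),
    NodeSummary("B", rows=3000, rows_per_sec=2000),
    NodeSummary("C", rows=5000, rows_per_sec=500),
]
needed = required_sample_size(cv=2.0, rel_error=0.05)
plan = select_summaries(nodes, needed, max_seconds=1.0)
print(needed, plan)
```

A `None` result signals that the requested response time and accuracy cannot both be satisfied with the summaries currently available, at which point a system of this kind would have to relax one of the two bounds or wait for more nodes to join.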