Progressive evaluation of nested aggregate queries

Authors:
Kian-Lee Tan;Cheng Hian Goh;Beng Chin Ooi
Affiliations:
Department of Computer Science, National University of Singapore, 3 Science Drive 2, Singapore 117543; E-mail: tankl@comp.nus.edu.sg
Venue:
The VLDB Journal — The International Journal on Very Large Data Bases
Year:
2000

Abstract

In many decision-making scenarios, decision makers require rapid feedback to their queries, which typically involve aggregates. The traditional blocking execution model can no longer meet the demands of these users. One promising approach in the literature, called online aggregation, evaluates an aggregation query progressively as follows: as soon as certain data have been evaluated, approximate answers are produced with their respective running confidence intervals; as more data are examined, the answers and their corresponding running confidence intervals are refined. In this paper, we extend this approach to handle nested queries with aggregates (i.e., at least one inner query block is an aggregate query) by providing users with (approximate) answers progressively as the inner aggregation query blocks are evaluated. We address the new issues posed by nested queries. In particular, the answer space begins with a superset of the final answers and is refined as the aggregates from the inner query blocks are refined. For the intermediary answers to be meaningful, they have to be interpreted together with the aggregates from the inner queries. We also propose a multi-threaded model for evaluating such queries: each query block is assigned to a thread, and the threads can be evaluated concurrently and independently. The time slice across the threads is nondeterministic in the sense that the user controls the relative rate at which these subqueries are being evaluated. For enumerative nested queries, we propose a priority-based evaluation strategy that presents answers that are certainly in the final answer space first, before presenting those whose validity may be affected as the inner query aggregates are refined. We implemented a prototype system in Java and evaluated it. Results are reported for nested queries with one level and with multiple levels of nesting. Our results show the effectiveness of the proposed mechanisms in providing progressive feedback that significantly reduces the initial waiting time of users without sacrificing the quality of the answers.
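
The idea of a running confidence interval feeding a nested predicate can be illustrated with a minimal sketch. It is not the authors' prototype (which was written in Java) and not their estimator: it assumes a randomly ordered scan, a normal-approximation interval for AVG, and a hypothetical outer predicate of the form `value > (SELECT AVG(...))`. The names `running_mean_ci` and `classify` are invented for illustration. Outer values above the upper bound are treated as likely answers at the chosen confidence level, and values inside the interval are held back as uncertain.

```python
import math
import random


def running_mean_ci(values, z=1.96):
    """Yield (n, mean, half_width) after each value (Welford's update).

    Assumption: values arrive in random order, so a normal-approximation
    interval mean +/- z * standard_error is a reasonable running estimate.
    """
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in values:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
        if n > 1:
            std_err = math.sqrt(m2 / (n - 1) / n)
            half_width = z * std_err
        else:
            half_width = math.inf  # a single value gives no variance estimate
        yield n, mean, half_width


def classify(outer_values, mean, half_width):
    """Split outer values for the predicate `value > AVG(...)`.

    Values above the interval are likely answers; values inside the
    interval may enter or leave the answer as the inner aggregate is refined.
    """
    low, high = mean - half_width, mean + half_width
    likely_in = [v for v in outer_values if v > high]
    uncertain = [v for v in outer_values if low < v <= high]
    return likely_in, uncertain


if __name__ == "__main__":
    random.seed(0)
    # Hypothetical data: 10,000 synthetic salaries.
    salaries = [random.gauss(50_000, 10_000) for _ in range(10_000)]
    random.shuffle(salaries)
    for n, mean, hw in running_mean_ci(salaries):
        if n in (10, 100, 1_000, 10_000):
            likely_in, uncertain = classify(salaries, mean, hw)
            print(f"n={n:>6} avg={mean:,.0f} +/- {hw:,.0f} "
                  f"likely_in={len(likely_in)} uncertain={len(uncertain)}")
```

As more inner tuples are scanned the interval narrows, so the uncertain set shrinks and the candidate answer set converges from a superset toward the final answer, which is the behavior the abstract describes.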