Reusing invariants: a new strategy for correlated queries

Authors:
Jun Rao;Kenneth A. Ross
Affiliations:
Department of Computer Science, Columbia University;Department of Computer Science, Columbia University
Venue:
SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Year:
1998



Abstract

Correlated queries are very common and important in decision support systems. Traditional nested iteration evaluation methods for such queries can be very time consuming. When they apply, query rewriting techniques have been shown to be much more efficient. But query rewriting is not always possible. When query rewriting does not apply, can we do something better than the traditional nested iteration methods? In this paper, we propose a new invariant technique to evaluate correlated queries efficiently. The basic idea is to recognize the part of the subquery that is not related to the outer references and cache the result of that part after its first execution. Later, we can reuse the result and combine it with the result of the rest of the subquery that is changing for each iteration. Our technique applies to arbitrary correlated subqueries.

This paper introduces algorithms to recognize the invariant part of a data flow tree, and to restructure the evaluation plan to reuse the stored intermediate result. We also propose an efficient method to teach an existing join optimizer to understand the invariant feature and thus allow it to be able to generate better join plans in the new context. Some other related optimization techniques are also discussed. The proposed techniques were implemented within three months on an existing real commercial database system.

We also experimentally evaluate our proposed technique. Our evaluation indicates that, when query rewriting is not possible, the invariant technique is significantly better than the traditional nested iteration method. Even when query rewriting applies, the invariant technique is sometimes better than the query rewriting technique. Our conclusion is that the invariant technique should be considered as one of the alternatives in evaluating correlated queries since it fills the gap left by rewriting techniques.
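
The basic idea in the abstract (compute the part of the subquery that does not depend on outer references once, cache it, and combine it with the correlated part on each iteration) can be illustrated with a small Python sketch. This is not the authors' implementation, which operates on data flow trees inside a commercial database system. The tables, column names, and the function `run_outer_query` are hypothetical and exist only for illustration.

```python
def run_outer_query(orders, parts, suppliers):
    """Nested iteration with the invariant part of the subquery cached.

    Hypothetical query (all table and column names are assumptions):
        SELECT o.id FROM orders o
        WHERE o.price > (SELECT MIN(p.cost)
                         FROM parts p, suppliers s
                         WHERE p.sid = s.sid
                           AND s.region = 'EU'           -- invariant
                           AND p.category = o.category)  -- correlated
    """
    cache = {}
    stats = {"invariant_runs": 0}  # counts how often the invariant part runs

    def invariant_part():
        # This filter never references the outer row, so its result is
        # computed on first use and reused on every later iteration.
        if "eu_sids" not in cache:
            stats["invariant_runs"] += 1
            cache["eu_sids"] = {s["sid"] for s in suppliers
                                if s["region"] == "EU"}
        return cache["eu_sids"]

    def subquery(category):
        # The correlated part changes per outer row; it is combined with
        # the cached invariant result instead of recomputing everything.
        eu_sids = invariant_part()
        costs = [p["cost"] for p in parts
                 if p["category"] == category and p["sid"] in eu_sids]
        return min(costs) if costs else None  # None stands in for SQL NULL

    result = []
    for o in orders:  # traditional nested iteration over the outer block
        threshold = subquery(o["category"])
        if threshold is not None and o["price"] > threshold:
            result.append(o["id"])
    return result, stats


if __name__ == "__main__":
    suppliers = [{"sid": 1, "region": "EU"}, {"sid": 2, "region": "US"},
                 {"sid": 3, "region": "EU"}]
    parts = [{"sid": 1, "category": "bolt", "cost": 5},
             {"sid": 2, "category": "bolt", "cost": 1},
             {"sid": 3, "category": "bolt", "cost": 3},
             {"sid": 1, "category": "nut", "cost": 2}]
    orders = [{"id": 10, "category": "bolt", "price": 4},
              {"id": 11, "category": "nut", "price": 1},
              {"id": 12, "category": "bolt", "price": 2},
              {"id": 13, "category": "gear", "price": 9}]
    print(run_outer_query(orders, parts, suppliers))
```

In this sketch the outer loop runs four times, but the invariant filter on `suppliers` is evaluated only once. A plain nested iteration would repeat that filter for every outer row.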