Optimization of multiple-relation multiple-disjunct queries

Authors:
M. Muralikrishna;David J. DeWitt
Affiliations:
Computer Sciences Department, University of Wisconsin, Madison, WI;Computer Sciences Department, University of Wisconsin, Madison, WI
Venue:
Proceedings of the seventh ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems
Year:
1988

References (8)

Query processing in a system for distributed databases (SDD-1). ACM Transactions on Database Systems (TODS)
Query optimization in star computer networks. ACM Transactions on Database Systems (TODS)
Optimization of query evaluation algorithms. ACM Transactions on Database Systems (TODS)
Decomposition—a strategy for query processing. ACM Transactions on Database Systems (TODS)
Query Optimization in Database Systems. ACM Computing Surveys (CSUR)
Optimizing the performance of a relational algebra database interface. Communications of the ACM
Access path selection in a relational database management system. SIGMOD '79 Proceedings of the 1979 ACM SIGMOD international conference on Management of data
Optimal implementation of conjunctive queries in relational data bases. STOC '77 Proceedings of the ninth annual ACM symposium on Theory of computing

Cited by (2)

An optimization of disjunctive queries: union-pushdown. COMPSAC '97 Proceedings of the 21st International Computer Software and Applications Conference
Factorizing complex predicates in queries to exploit indexes. Proceedings of the 2003 ACM SIGMOD international conference on Management of data

Abstract

In this paper we discuss the optimization of multiple-relation multiple-disjunct queries in a relational database system. Since optimization techniques for conjunctive (single-disjunct) queries in relational databases are well known [Smith75, Wong76, Selinger79, Yao79, Youssefi79], the natural way to evaluate a multiple-disjunct query has been to execute each disjunct independently [Bernstein81, Kerschberg82]. However, evaluating each disjunct independently may be very inefficient. In this paper, we develop methods that merge two or more disjuncts to form a term. The advantage of merging disjuncts into terms is that each term can be evaluated with a single scan of each relation present in the term. In addition, merging two or more disjuncts reduces the number of times a join is performed. We present the criteria for merging a set of disjuncts. As we will see, the number of times each relation in the query is scanned equals the number of terms. Thus, minimizing the number of terms minimizes the number of scans of each relation. We formulate the problem of minimizing the number of scans as one of covering a merge graph by a minimum number of complete merge graphs, which are a restricted class of Cartesian product graphs. In general, the problem of minimizing the number of scans is NP-complete. We present polynomial-time algorithms for special classes of merge graphs, and a heuristic for general merge graphs.

Throughout this paper, we assume that no relations have any indices on them and that we are only concerned with reducing the number of scans of each relation present in the query. What about relations that have indices on them? It turns out that our performance metric of reducing the number of scans is beneficial even when there are indices. In [Muralikrishna88] we demonstrate that when optimizing single-relation multiple-disjunct queries, the cost (measured in disk accesses) may be reduced if all the disjuncts are optimized together rather than individually. Thus, our algorithm for minimizing the number of terms is also very beneficial in cases where indices exist.
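
The scan-count argument in the abstract can be illustrated with a small sketch. It is not the paper's algorithm: it covers only the simplest case of a single unindexed relation, where all disjuncts can trivially be merged into one term, and it does not model the merge criteria, merge graphs, or joins. The names CountingRelation, eval_independently, and eval_merged are hypothetical and introduced here for illustration only.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Set

Row = Dict[str, int]
Predicate = Callable[[Row], bool]


class CountingRelation:
    """Hypothetical unindexed relation that counts full scans over its rows."""

    def __init__(self, rows: Iterable[Row]) -> None:
        self.rows: List[Row] = list(rows)
        self.scans = 0

    def scan(self) -> Iterator[Row]:
        # The counter is bumped when iteration over the generator begins.
        self.scans += 1
        yield from self.rows


def eval_independently(rel: CountingRelation, disjuncts: List[Predicate]) -> Set[int]:
    """Evaluate each disjunct with its own scan and union the matching row ids."""
    result: Set[int] = set()
    for pred in disjuncts:
        for i, row in enumerate(rel.scan()):
            if pred(row):
                result.add(i)
    return result


def eval_merged(rel: CountingRelation, disjuncts: List[Predicate]) -> Set[int]:
    """Treat all disjuncts as one term and test them together in a single scan."""
    result: Set[int] = set()
    for i, row in enumerate(rel.scan()):
        if any(pred(row) for pred in disjuncts):
            result.add(i)
    return result
```

Under these assumptions, evaluating k disjuncts independently scans the relation k times, while the merged evaluation scans it once and returns the same set of rows. In the multiple-relation setting the paper addresses, not every set of disjuncts can be merged, which is where its merge criteria and graph-covering formulation come in.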