Predicate migration: optimizing queries with expensive predicates

Authors:
Joseph M. Hellerstein;Michael Stonebraker
Affiliations:
Computer Sciences Department, University of Wisconsin, Madison, WI and Computer Science Division, EECS Department, University of California Berkeley, CA;Computer Science Division, EECS Department, University of California Berkeley, CA
Venue:
SIGMOD '93 Proceedings of the 1993 ACM SIGMOD international conference on Management of data
Year:
1993

Abstract

The traditional focus of relational query optimization schemes has been on the choice of join methods and join orders. Restrictions have typically been handled in query optimizers by “predicate pushdown” rules, which apply restrictions in some random order before as many joins as possible. These rules work under the assumption that restriction is essentially a zero-time operation. However, today's extensible and object-oriented database systems allow users to define time-consuming functions, which may be used in a query's restriction and join predicates. Furthermore, SQL has long supported subquery predicates, which may be arbitrarily time-consuming to check. Thus restrictions should not be considered zero-time operations, and the model of query optimization must be enhanced.

In this paper we develop a theory for moving expensive predicates in a query plan so that the total cost of the plan (including the costs of both joins and restrictions) is minimal. We present an algorithm to implement the theory, as well as results of our implementation in POSTGRES. Our experience with the newly enhanced POSTGRES query optimizer demonstrates that correctly optimizing queries with expensive predicates often produces plans that are orders of magnitude faster than plans generated by a traditional query optimizer. The additional complexity of considering expensive predicates during optimization is found to be manageably small.
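To make concrete why restrictions cannot be treated as zero-time operations, the following is a minimal sketch, not the paper's algorithm. It covers only the simplest sub-problem: ordering several restrictions applied to a single stream of tuples. It assumes each predicate has a known per-tuple cost and selectivity and that predicates are independent; the `Predicate` class, the predicate names, and the numbers are all hypothetical. Under those assumptions, sorting by the ratio (selectivity - 1) / cost is a standard way to minimize expected cost, and the sketch checks it against brute force. Moving predicates across joins, which the abstract addresses, is not modeled here.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class Predicate:
    name: str           # hypothetical label, for illustration only
    cost: float         # assumed per-tuple evaluation cost (arbitrary units)
    selectivity: float  # assumed fraction of tuples that pass, in [0, 1]


def expected_cost(order):
    """Expected cost per input tuple of applying predicates in this order.

    A predicate is only evaluated on tuples that survived all earlier ones.
    """
    total, surviving = 0.0, 1.0
    for p in order:
        total += surviving * p.cost
        surviving *= p.selectivity
    return total


def order_by_rank(preds):
    """Sort ascending by (selectivity - 1) / cost.

    Cheap, highly selective predicates get the most negative ratio and run first.
    """
    return sorted(preds, key=lambda p: (p.selectivity - 1.0) / p.cost)


# Hypothetical predicates: one cheap comparison and two expensive checks.
preds = [
    Predicate("image_match", cost=1000.0, selectivity=0.9),
    Predicate("subquery_check", cost=100.0, selectivity=0.1),
    Predicate("cheap_filter", cost=1.0, selectivity=0.5),
]

best = order_by_rank(preds)

# Brute-force check over all orderings (feasible only for a handful of predicates).
brute_force_min = min(expected_cost(list(o)) for o in permutations(preds))

if __name__ == "__main__":
    print([p.name for p in best], expected_cost(best), brute_force_min)
```

With these made-up numbers, the sorted order costs roughly 101 units per input tuple, while running `image_match` first costs more than 1000. That gap is the kind of difference an optimizer misses when it treats every restriction as free.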