Optimization of queries with user-defined predicates

Authors: Surajit Chaudhuri; Kyuseok Shim
Affiliations: Microsoft Research, Redmond, WA; Bell Lab., Murray Hill, NJ
Venue: ACM Transactions on Database Systems (TODS)
Year: 1999


Abstract

Relational databases provide the ability to store user-defined functions and predicates, which can be invoked in SQL queries. When evaluation of a user-defined predicate is relatively expensive, the traditional method of evaluating predicates as early as possible is no longer a sound heuristic. Two previous approaches exist for optimizing such queries, but neither is able to guarantee the optimal plan over the desired execution space. We present efficient techniques that do guarantee the choice of an optimal plan over the desired execution space. The optimization algorithm with complete rank-ordering improves upon the naive optimization algorithm by exploiting the nature of the cost formulas for join methods, and it is polynomial in the number of user-defined predicates (for a given number of relations). We also propose pruning rules that significantly reduce the cost of searching the execution space for both the naive algorithm and the optimization algorithm with complete rank-ordering, without compromising optimality. In addition, we propose a conservative local heuristic that is simpler and has low optimization overhead. Although it is not guaranteed to find the optimal plan in every case, it produces close-to-optimal plans in most cases. We discuss how to determine the algorithm of choice depending on application requirements. We emphasize that our optimization algorithms handle user-defined selections and user-defined join predicates uniformly. We present a complexity analysis and an experimental comparison of the algorithms.
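As background for the rank-ordering idea mentioned in the abstract, here is a minimal Python sketch of the standard rank metric from the expensive-predicate literature, rank = (selectivity - 1) / cost per tuple. It covers only the simplest setting, independent selection predicates on a single relation, and is not the paper's algorithm, which also handles joins. The predicate names, costs, and selectivities are invented for illustration, and `expected_cost`, `rank`, and `order_by_rank` are hypothetical helpers.

```python
from itertools import permutations

# Each predicate is (name, cost_per_tuple, selectivity).
# All values below are made up for illustration only.
PREDICATES = [
    ("cheap_filter", 1.0, 0.5),
    ("image_match", 100.0, 0.1),
    ("text_classify", 20.0, 0.9),
]


def expected_cost(preds):
    """Expected per-tuple cost of applying the predicates in the given order.

    A tuple reaches a predicate only if it passed all earlier ones;
    selectivities are assumed to be independent.
    """
    total, surviving = 0.0, 1.0
    for _name, cost, selectivity in preds:
        total += surviving * cost
        surviving *= selectivity
    return total


def rank(pred):
    """Rank metric: (selectivity - 1) / cost per tuple."""
    _name, cost, selectivity = pred
    return (selectivity - 1.0) / cost


def order_by_rank(preds):
    """Order predicates by ascending rank."""
    return sorted(preds, key=rank)


ranked = order_by_rank(PREDICATES)
# Brute force over all orderings, used only to check the ranked order.
brute_force = min(permutations(PREDICATES), key=expected_cost)

print([name for name, _, _ in ranked], expected_cost(ranked))
print(expected_cost(brute_force))
```

In this toy setting the ranked order has the same expected cost as the best ordering found by brute force. Once joins are involved, placing an expensive predicate also changes how many tuples the join processes, which is the situation the abstract says requires a more careful search.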