Query execution techniques for caching expensive methods

Authors:
Joseph M. Hellerstein;Jeffrey F. Naughton
Affiliations:
University of California, Berkeley, EECS, Computer Science Division, 387 Soda Hall #1776, Berkeley CA and University of Wisconsin, Department of Computer Sciences, 1210 W. Dayton St., Madison, WI;University of Wisconsin, Department of Computer Sciences, 1210 W. Dayton St., Madison, WI
Venue:
SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data
Year:
1996



Abstract

Object-Relational and Object-Oriented DBMSs allow users to invoke time-consuming ("expensive") methods in their queries. When queries containing these expensive methods are run on data with duplicate values, time is wasted redundantly computing methods on the same value. This problem has been studied in the context of programming languages, where "memoization" is the standard solution. In the database literature, sorting has been proposed to deal with this problem. We compare these approaches along with a third solution, a variant of unary hybrid hashing which we call Hybrid Cache. We demonstrate that Hybrid Cache always dominates memoization, and significantly outperforms sorting in many instances. This provides new insights into the tradeoff between hashing and sorting for unary operations. Additionally, our Hybrid Cache algorithm includes some new optimizations for unary hybrid hashing, which can be used for other applications such as grouping and duplicate elimination. We conclude with a discussion of techniques for caching multiple expensive methods in a single query, and raise some new optimization problems in choosing caching techniques.
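
As a rough illustration of the two baseline approaches named in the abstract, the sketch below applies an expensive method to a column with duplicate values, once using memoization (an in-memory table keyed by input value) and once using sorting (duplicates become adjacent, so only the most recent result is kept). The function names `apply_with_memo` and `apply_with_sort` are hypothetical, and the sketch assumes all data fits in memory. It does not attempt to reproduce the paper's Hybrid Cache algorithm, whose details the abstract does not give.

```python
from typing import Callable, Dict, Hashable, Iterable, List, Tuple


def apply_with_memo(values: Iterable[Hashable],
                    method: Callable) -> List[Tuple]:
    """Memoization: remember every distinct input seen so far.

    Preserves input order, but the table grows with the number of
    distinct values (assumed here to fit in memory).
    """
    cache: Dict[Hashable, object] = {}
    out = []
    for v in values:
        if v not in cache:
            cache[v] = method(v)  # computed once per distinct value
        out.append((v, cache[v]))
    return out


def apply_with_sort(values: Iterable, method: Callable) -> List[Tuple]:
    """Sorting: bring duplicates together, then cache only the last result.

    Needs a single cached entry, but pays for the sort and emits
    results in sorted order rather than input order.
    """
    out = []
    have_last = False
    last_value = None
    last_result = None
    for v in sorted(values):
        if not have_last or v != last_value:
            last_value, last_result = v, method(v)
            have_last = True
        out.append((v, last_result))
    return out
```

In both variants the method runs once per distinct input value. They differ in memory use and output order, which is the kind of hashing-versus-sorting tradeoff the abstract refers to.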