Query execution techniques for caching expensive methods

  • Authors:
  • Joseph M. Hellerstein;Jeffrey F. Naughton

  • Affiliations:
  • University of California, Berkeley, EECS, Computer Science Division, 387 Soda Hall #1776, Berkeley CA and University of Wisconsin, Department of Computer Sciences, 1210 W. Dayton St., Madison, WI;University of Wisconsin, Department of Computer Sciences, 1210 W. Dayton St., Madison, WI

  • Venue:
  • SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data
  • Year:
  • 1996

Quantified Score

Hi-index 0.00

Visualization

Abstract

Object-Relational and Object-Oriented DBMSs allow users to invoke time-consuming ("expensive") methods in their queries. When queries containing these expensive methods are run on data with duplicate values, time is wasted redundantly computing methods on the same value. This problem has been studied in the context of programming languages, where "memoization" is the standard solution. In the database literature, sorting has been proposed to deal with this problem. We compare these approaches along with a third solution, a variant of unary hybrid hashing which we call Hybrid Cache. We demonstrate that Hybrid Cache always dominates memoization, and significantly outperforms sorting in many instances. This provides new insights into the tradeoff between hashing and sorting for unary operations. Additionally, our Hybrid Cache algorithm includes some new optimization for unary hybrid hashing, which can be used for other applications such as grouping and duplicate elimination. We conclude with a discussion of techniques for caching multiple expensive methods in a single query, and raise some new optimization problems in choosing caching techniques.