Optimizing candidate check costs for bitmap indices

  • Authors:
  • Doron Rotem;Kurt Stockinger;Kesheng Wu

  • Affiliations:
  • Lawrence Berkeley National Laboratory, University of California;Lawrence Berkeley National Laboratory, University of California;Lawrence Berkeley National Laboratory, University of California

  • Venue:
  • Proceedings of the 14th ACM international conference on Information and knowledge management
  • Year:
  • 2005

Quantified Score

Hi-index 0.00

Visualization

Abstract

In this paper, we propose a new strategy for optimizing the placement of bin boundaries to minimize the cost of query evaluation using bitmap indices with binning. For attributes with a large number of distinct values, often the most efficient index scheme is a bitmap index with binning. However, this type of index may not be able to fully resolve some user queries. To fully resolve these queries, one has to access parts of the original data to check whether certain candidate records actually satisfy the specified conditions. We call this procedure the candidate check, which usually dominates the total query processing time. Given a set of user queries, we seek to minimize the total time required to an-swer the queries by optimally placing the bin boundaries. We show that our dynamic programming based algorithm can efficiently determine the bin boundaries. We verify our analysis with some real user queries from the Sloan Digital Sky Survey. For queries that require significant amount of time to perform candidate check, using our optimal bin boundaries reduces the candidate check time by a factor of 2 and the total query processing time by 40%.