Approximate encoding for direct access and query processing over compressed bitmaps

  • Authors:
  • Tan Apaydin; Guadalupe Canahuate; Hakan Ferhatosmanoglu; Ali Saman Tosun

  • Affiliations:
  • The Ohio State University; The Ohio State University; The Ohio State University; University of Texas at San Antonio

  • Venue:
  • VLDB '06: Proceedings of the 32nd International Conference on Very Large Data Bases
  • Year:
  • 2006

Abstract

Bitmap indices have been widely and successfully used in scientific and commercial databases. Compression techniques based on run-length encoding are used to reduce their storage requirements. However, these techniques introduce significant overhead in query processing, even when only a few rows are queried. We propose a new bitmap encoding scheme based on multiple hashing, in which the bitmap is kept in compressed form and can be accessed directly without decompression. Any subset of rows and/or columns can be retrieved efficiently by reconstructing and processing only the necessary subset of the bitmap. The proposed scheme provides approximate results, with a trade-off between space and accuracy. False misses are guaranteed not to occur, and the false positive rate can be estimated and controlled. We show that query execution is significantly faster than with WAH-compressed bitmaps, which were previously shown to achieve the fastest query response times. The proposed scheme achieves accurate results (90%-100%) and speeds up query processing by 1 to 3 orders of magnitude compared to WAH.
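
The abstract does not spell out the encoding itself, so the following is only a minimal sketch of the general idea, assuming a Bloom-filter-style structure: each set cell (row, column) of the bitmap is hashed by several hash functions into a fixed-size bit array, and a cell is reported as set only if all of its hashed positions are set. The class name `ApproxBitmap`, its methods, the SHA-256 based double hashing, and the false positive estimate (the standard Bloom filter approximation) are all assumptions made for illustration and are not taken from the paper.

```python
import hashlib
import math


class ApproxBitmap:
    """Hypothetical Bloom-filter-style encoding of the set cells of a bitmap."""

    def __init__(self, num_bits: int, num_hashes: int):
        self.m = num_bits                      # size of the compressed bit array
        self.k = num_hashes                    # number of hash functions
        self.bits = bytearray((num_bits + 7) // 8)
        self.n = 0                             # number of set cells inserted

    def _positions(self, row: int, col: int):
        # Derive k positions from two 64-bit halves of one digest
        # (double hashing; an assumption, not the paper's hash functions).
        digest = hashlib.sha256(f"{row}:{col}".encode()).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def set_bit(self, row: int, col: int) -> None:
        # Record that cell (row, col) of the original bitmap is 1.
        for p in self._positions(row, col):
            self.bits[p >> 3] |= 1 << (p & 7)
        self.n += 1

    def test(self, row: int, col: int) -> bool:
        # Direct access to one cell: no decompression of other rows or columns.
        # True may be a false positive; False is always correct.
        return all(self.bits[p >> 3] & (1 << (p & 7))
                   for p in self._positions(row, col))

    def query_rows(self, rows, col: int):
        # Evaluate a predicate on one column for an arbitrary subset of rows.
        return [r for r in rows if self.test(r, col)]

    def expected_fp_rate(self) -> float:
        # Standard Bloom filter estimate: (1 - e^(-k*n/m))^k.
        return (1.0 - math.exp(-self.k * self.n / self.m)) ** self.k
```

In this sketch, a query over a handful of rows touches only `k` bit positions per cell, which is what makes direct access cheap compared with scanning a run-length-compressed bitmap. Growing `num_bits` lowers the estimated false positive rate at the cost of space, mirroring the space/accuracy trade-off the abstract describes, while cells inserted with `set_bit` are never reported as unset.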