Bitmap indexes for large scientific data sets: a case study

  • Authors:
  • Rishi Rakesh Sinha;Soumyadeb Mitra;Marianne Winslett

  • Affiliations:
  • University of Illinois, Urbana-Champaign, Dept. of Computer Science, Urbana, IL;University of Illinois, Urbana-Champaign, Dept. of Computer Science, Urbana, IL;University of Illinois, Urbana-Champaign, Dept. of Computer Science, Urbana, IL

  • Venue:
  • IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
  • Year:
  • 2006

Abstract

The data used by today's scientific applications are often very high in dimensionality and staggering in size. These characteristics necessitate the use of a good multidimensional indexing strategy to provide efficient access to the data. Researchers have previously proposed the use of bitmap indexes for high-dimension scientific data as a way of overcoming the drawbacks of traditional multidimensional indexes such as R-trees and KD-trees, which are bulky and whose performance does not scale well as the number of dimensions increases. However, the techniques proposed in previous work on bitmap indexes are not sufficient to address all problems that arise in practice. In experiments with real datasets, we experienced problems with index size and query performance. To overcome these shortcomings, we propose the use of adaptive, multilevel, multiresolution bitmap indexes, and evaluate their performance in two scientific domains. Our preliminary experiments with a parallel query processor and index creator also show that it is very easy to parallelize a bitmap index.
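
For readers unfamiliar with the underlying idea, the following is a minimal sketch of a plain binned bitmap index, assuming one bitmap per value bin and one bit per row. It is not the adaptive, multilevel, multiresolution design the abstract proposes. The function names, bin edges, and sample values are hypothetical and chosen only for illustration. Each attribute is indexed independently, and a multidimensional query is answered by combining per-attribute bitmaps with bitwise AND, which is also why the work is straightforward to split across attributes or row ranges.

```python
from bisect import bisect_right

def build_bitmap_index(values, bin_edges):
    """Build one bitmap (stored as a Python int) per bin.

    Bin k holds values v with bin_edges[k-1] <= v < bin_edges[k];
    bit i of a bitmap is set when values[i] falls in that bin.
    """
    bitmaps = [0] * (len(bin_edges) + 1)
    for row, v in enumerate(values):
        bitmaps[bisect_right(bin_edges, v)] |= 1 << row
    return bitmaps

def range_bitmap(bitmaps, lo_bin, hi_bin):
    """OR together the bitmaps of bins lo_bin..hi_bin (inclusive)."""
    acc = 0
    for b in range(lo_bin, hi_bin + 1):
        acc |= bitmaps[b]
    return acc

def to_rows(bitmap):
    """Return the row ids whose bits are set in the bitmap."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]

# Hypothetical two-attribute data set with five rows.
temperature = [12.0, 25.5, 31.2, 18.4, 29.9]
pressure = [1.0, 0.8, 1.2, 0.9, 1.1]

t_index = build_bitmap_index(temperature, [15.0, 25.0])  # bins: <15, [15,25), >=25
p_index = build_bitmap_index(pressure, [1.0])            # bins: <1.0, >=1.0

# Query: temperature >= 25 AND pressure >= 1.0.
# Both conditions align with bin boundaries, so no raw-data check is needed.
hits = range_bitmap(t_index, 2, 2) & range_bitmap(p_index, 1, 1)
matching_rows = to_rows(hits)
```

In this sketch the query bounds coincide with bin edges. When they do not, the rows in the boundary bins are only candidates and must be checked against the raw values, which is one place where bin resolution trades index size against query cost.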