Compressed dictionaries: space measures, data sets, and experiments

  • Authors:
  • Ankur Gupta;Wing-Kai Hon;Rahul Shah;Jeffrey Scott Vitter

  • Affiliations:
  • Department of Computer Sciences, Purdue University, West Lafayette, IN;Department of Computer Sciences, Purdue University, West Lafayette, IN;Department of Computer Sciences, Purdue University, West Lafayette, IN;Department of Computer Sciences, Purdue University, West Lafayette, IN

  • Venue:
  • WEA'06 Proceedings of the 5th international conference on Experimental Algorithms
  • Year:
  • 2006

Abstract

In this paper, we present an experimental study of the space-time tradeoffs for the dictionary problem: design a data structure that represents a subset S of n items drawn from a universe U = {0, 1, ..., u – 1} and supports various queries on S. Our primary goal is to reduce the space required for such a dictionary data structure. Many compression schemes have been developed for dictionaries; they fall broadly into two categories, combinatorial encodings and data-aware methods, and both still support queries efficiently. We show that for many real-world datasets, data-aware methods achieve worthwhile compression over combinatorial methods. Additionally, we design a new data-aware building-block structure called BSGAP that improves on other data-aware methods.
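
To make the contrast between the two categories concrete, the following minimal Python sketch compares two space measures on a small clustered set. It is an illustration under stated assumptions, not the paper's method: the abstract does not define its measures, so the sketch assumes the combinatorial bound is the information-theoretic minimum ceil(log2 C(u, n)) and takes the data-aware measure to be a simple sum of the bit lengths of the gaps between consecutive items. The function names `combinatorial_bits` and `gap_bits` are hypothetical, and the gap measure ignores the overhead needed to delimit the gaps or to answer queries.

```python
import math


def combinatorial_bits(n, u):
    """Bits needed to distinguish every n-subset of a universe of size u.

    Computes ceil(log2(C(u, n))) exactly with integer arithmetic.
    """
    return (math.comb(u, n) - 1).bit_length()


def gap_bits(items):
    """Assumed data-aware measure: total bit length of the gaps.

    The first gap is measured from -1 so that every gap is at least 1.
    Delimiter and query-support overhead is deliberately ignored.
    """
    total = 0
    previous = -1
    for item in sorted(items):
        gap = item - previous
        total += gap.bit_length()  # floor(log2(gap)) + 1 bits
        previous = item
    return total


# A tightly clustered set: 16 consecutive items in a universe of size 2**16.
clustered = list(range(1000, 1016))
print("combinatorial:", combinatorial_bits(len(clustered), 2**16), "bits")
print("gap measure:  ", gap_bits(clustered), "bits")
```

On clustered data like this, the gap measure comes out far smaller than the combinatorial bound, which depends only on n and u and so cannot exploit the distribution of the items.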