Grammar Precompression Speeds Up Burrows-Wheeler Compression

  • Authors:
  • Juha Kärkkäinen;Pekka Mikkola;Dominik Kempa

  • Affiliations:
  • Department of Computer Science, University of Helsinki, Finland;Department of Computer Science, University of Helsinki, Finland;Department of Computer Science, University of Helsinki, Finland

  • Venue:
  • SPIRE'12: Proceedings of the 19th International Conference on String Processing and Information Retrieval
  • Year:
  • 2012

Abstract

Text compression algorithms based on the Burrows-Wheeler transform (BWT) typically achieve a good compression ratio but are slow compared to Lempel-Ziv type compression algorithms. The main culprit is the time needed to compute the BWT during compression and its inverse during decompression. We propose to speed up BWT-based compression by performing a grammar-based precompression before the transform. The idea is to reduce the amount of data that the BWT and its inverse have to process. We have developed a very fast grammar precompressor using pair replacement. Experiments show a substantial speedup in practice without a significant effect on compression ratio.
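
The pair-replacement idea can be illustrated with a minimal sketch. This is not the authors' precompressor, whose design the abstract does not describe; it is a naive, quadratic-time illustration of the general technique. The function names `pair_precompress` and `expand`, the parameters `min_count` and `max_rules`, and the convention of numbering new symbols from 256 upward are all assumptions made for this example. Each round replaces the most frequent adjacent pair of symbols with a fresh symbol and records the rule, so the sequence handed to the BWT gets shorter.

```python
from collections import Counter


def pair_precompress(data: bytes, min_count: int = 2, max_rules: int = 256):
    """Naive pair replacement (illustrative only, not the paper's algorithm).

    Returns the shortened symbol sequence and the grammar rules.
    Symbols below 256 are literal bytes; symbols >= 256 are rule names.
    """
    seq = list(data)
    rules = {}          # new symbol -> (left symbol, right symbol)
    next_symbol = 256   # assumption: byte alphabet, so 256 is the first free id

    while len(rules) < max_rules and len(seq) >= 2:
        # Count adjacent pairs. Overlapping occurrences (as in "aaa") are
        # counted too, so the count is an upper bound on actual replacements.
        counts = Counter(zip(seq, seq[1:]))
        pair, count = counts.most_common(1)[0]
        if count < min_count:
            break

        # Replace non-overlapping occurrences, scanning left to right.
        out = []
        i = 0
        while i < len(seq):
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(next_symbol)
                i += 2
            else:
                out.append(seq[i])
                i += 1

        rules[next_symbol] = pair
        next_symbol += 1
        seq = out

    return seq, rules


def expand(seq, rules) -> bytes:
    """Invert pair_precompress by expanding rule symbols back into bytes."""
    out = bytearray()
    stack = list(reversed(seq))
    while stack:
        symbol = stack.pop()
        if symbol < 256:
            out.append(symbol)
        else:
            left, right = rules[symbol]
            # Push right first so that left is expanded first.
            stack.append(right)
            stack.append(left)
    return bytes(out)
```

In a full pipeline under these assumptions, the shortened sequence (together with the rules) would be passed to the BWT stage in place of the original text, and `expand` would run after the inverse BWT during decompression. A practical precompressor would need a much faster pair-counting strategy than recounting all pairs in every round.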