Fast Data Compression with Antidictionaries

  • Authors:
  • Michael Davidson; Lucian Ilie

  • Affiliations:
  • Department of Computer Science, University of Western Ontario, N6A 5B7, London, Ontario, Canada. ilie@csd.uwo.ca; (Corresponding author) Department of Computer Science, University of Western Ontario, N6A 5B7, London, Ontario, Canada. ilie@csd.uwo.ca

  • Venue:
  • Fundamenta Informaticae - Contagious Creativity - In Honor of the 80th Birthday of Professor Solomon Marcus
  • Year:
  • 2005

Abstract

We consider data compression using antidictionaries and give algorithms for faster compression and decompression. While the original method of Crochemore et al. uses finite transducers with ε-moves, we (de)compress using ε-free transducers. This is provably faster, assuming the data is non-negligibly compressible, but we have to consider the overhead of building the new machines. In general, the ε-free machines can be quadratic in size compared to the ones allowing ε-moves; we prove this bound optimal, as it is reached for de Bruijn words. In practice, however, the size of the ε-free machines turns out to be close to the size of the ones allowing ε-moves, and therefore we can achieve significantly faster (de)compression. We show our results for the files in the Calgary corpus.
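
For readers unfamiliar with the underlying idea, the following is a minimal sketch of antidictionary compression on binary strings. It is an illustration only, and it is not the transducer-based method of the paper: it uses a naive suffix lookup instead of finite transducers (with or without ε-moves), and it builds the antidictionary by brute force. The function names and the `max_len` bound on forbidden-word length are assumptions made for this example. An antidictionary is a set of words that never occur in the text; whenever a suffix of the text read so far, followed by some bit, would form a forbidden word, the next bit is forced to be the other one and can be omitted from the output.

```python
def factors(text, max_len):
    """Return all substrings of text with length 0..max_len."""
    return {text[i:i + n]
            for n in range(max_len + 1)
            for i in range(len(text) - n + 1)}


def antidictionary(text, max_len):
    """Brute-force set of minimal forbidden words of length <= max_len.

    A word w is a minimal forbidden word if w does not occur in text
    but both w[:-1] and w[1:] do.
    """
    facs = factors(text, max_len)
    ad = set()
    for u in facs:                      # u plays the role of w[:-1]
        if len(u) >= max_len:
            continue
        for b in "01":
            w = u + b
            if w not in facs and w[1:] in facs:
                ad.add(w)
    return ad


def forced_bit(prefix, ad, max_len):
    """Return the bit forced after prefix by ad, or None if unconstrained."""
    for k in range(min(max_len - 1, len(prefix)) + 1):
        u = prefix[len(prefix) - k:]    # suffix of length k (empty for k = 0)
        for b in "01":
            if u + b in ad:
                # u + b is forbidden, so the next bit must be the other one.
                return "1" if b == "0" else "0"
    return None


def compress(text, ad, max_len):
    """Keep only the bits that the antidictionary cannot predict."""
    out = []
    for i, c in enumerate(text):
        if forced_bit(text[:i], ad, max_len) is None:
            out.append(c)
    return "".join(out)


def decompress(bits, n, ad, max_len):
    """Rebuild a text of length n from the kept bits and the antidictionary."""
    out = ""
    it = iter(bits)
    while len(out) < n:
        f = forced_bit(out, ad, max_len)
        out += f if f is not None else next(it)
    return out
```

For the text `0101010101` with `max_len = 3`, the antidictionary is `{"00", "11"}` and only the first bit needs to be stored, since every later bit is forced. The decoder needs the antidictionary and the original length in addition to the kept bits. This sketch rescans suffixes at every position, which is far slower than the automaton-based approaches the abstract discusses.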