Re-pair Achieves High-Order Entropy

  • Authors: Gonzalo Navarro; Luís Russo

  • Venue: DCC '08 Proceedings of the Data Compression Conference
  • Year: 2008

Abstract

Re-Pair is a dictionary-based compression method invented in 1999 by Larsson and Moffat. Although its practical performance has been established through experiments, the method has resisted all attempts at formal analysis. In this paper we show that Re-Pair compresses a sequence $T[1,n]$ over an alphabet of size $\sigma$ and with $k$-th order entropy $H_k$ to at most $2nH_k + o(n \log \sigma)$ bits, for any $k = o(\log_\sigma n)$.
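
To make the two quantities in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the usual description of Re-Pair (repeatedly replace the most frequent pair of adjacent symbols with a fresh symbol until no pair occurs twice) and the standard definition of empirical $k$-th order entropy. The function names `repair`, `expand`, and `empirical_entropy` are hypothetical. The quadratic-time loop is for illustration only and does not reproduce the linear-time algorithm of Larsson and Moffat or the encoding analysed in the paper.

```python
from collections import Counter
from math import log2


def repair(seq):
    """Naive Re-Pair sketch: returns (compressed sequence, rules).

    New symbols are tuples ("R", i); the input is assumed not to
    contain such tuples. Pair counts include overlapping occurrences
    (e.g. "aaa" counts ("a", "a") twice), a simplification.
    """
    seq = list(seq)
    rules = {}  # new symbol -> (left symbol, right symbol)
    while True:
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < 2:
            break  # no pair repeats, so stop
        new = ("R", len(rules))
        rules[new] = pair
        # Left-to-right scan replacing non-overlapping occurrences.
        out, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(new)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
    return seq, rules


def expand(seq, rules):
    """Invert repair(): expand rule symbols back into original symbols."""
    out = []
    stack = list(reversed(seq))
    while stack:
        sym = stack.pop()
        if sym in rules:
            left, right = rules[sym]
            stack.append(right)
            stack.append(left)
        else:
            out.append(sym)
    return out


def empirical_entropy(text, k):
    """Empirical k-th order entropy H_k of a string, in bits per symbol."""
    n = len(text)
    if n == 0:
        return 0.0
    followers = {}  # length-k context -> counts of the symbol after it
    for i in range(k, n):
        followers.setdefault(text[i - k:i], Counter())[text[i]] += 1
    total = 0.0
    for counts in followers.values():
        m = sum(counts.values())
        total -= sum(c * log2(c / m) for c in counts.values())
    return total / n
```

For example, `repair("abababab")` yields a two-symbol sequence and two rules, while `empirical_entropy("abababab", 1)` is zero because each symbol is fully determined by its predecessor.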