Implementing the Context Tree Weighting Method for Text Compression

  • Authors:
  • Kunihiko Sadakane;Takumi Okazaki;Hiroshi Imai

  • Venue:
  • DCC '00 Proceedings of the Conference on Data Compression
  • Year:
  • 2000

Abstract

The context tree weighting (CTW) method is a universal compression algorithm for FSMX sources. Although it is expected to achieve good compression ratios in practice, it is difficult to implement, and in many cases implementations serve only to estimate the compression ratio. Willems and Tjalkens showed a practical implementation that uses conditional probabilities instead of block probabilities, but it applies only to binary-alphabet sequences. We extend the method to multi-alphabet sequences and show a simple implementation using PPM techniques. We also propose a method for optimizing a parameter of context tree weighting in the binary-alphabet case. Experimental results on texts and DNA sequences show that the performance of PPM can be improved by combining it with context tree weighting, and that DNA sequences can be compressed to less than 2.0 bpc.
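For readers unfamiliar with the method, the following is a minimal sketch of textbook binary context tree weighting in its block-probability form, the variant that the abstract describes as typically used only for estimating compression ratio. It is not the multi-alphabet, PPM-based implementation proposed in the paper. The names `Node`, `log_add`, and `ctw_code_length` are hypothetical, the Krichevsky-Trofimov estimator and the 1/2 weighting are the standard textbook choices, and padding the initial context with zeros is an assumption made here for simplicity.

```python
import math


class Node:
    """One context-tree node: symbol counts plus log-probabilities."""
    __slots__ = ("counts", "log_pe", "log_pw", "children")

    def __init__(self):
        self.counts = [0, 0]      # number of 0s and 1s seen in this context
        self.log_pe = 0.0         # log of the KT estimate (starts at log 1)
        self.log_pw = 0.0         # log of the weighted probability
        self.children = [None, None]


def log_add(x, y):
    """Return log(exp(x) + exp(y)) without underflow."""
    m = max(x, y)
    return m + math.log(math.exp(x - m) + math.exp(y - m))


def ctw_code_length(bits, depth):
    """Ideal code length in bits assigned by a depth-`depth` binary CTW.

    Assumption: the context before the first symbol is all zeros.
    """
    root = Node()
    history = [0] * depth
    for bit in bits:
        # Walk down the context path, most recent past bit first.
        path = [root]
        node = root
        for d in range(depth):
            c = history[-1 - d]
            if node.children[c] is None:
                node.children[c] = Node()
            node = node.children[c]
            path.append(node)
        # Update estimates from the leaf back up to the root.
        for d in range(depth, -1, -1):
            node = path[d]
            total = node.counts[0] + node.counts[1]
            # KT estimator: P(bit) = (count[bit] + 1/2) / (total + 1)
            node.log_pe += math.log((node.counts[bit] + 0.5) / (total + 1.0))
            node.counts[bit] += 1
            if d == depth:
                node.log_pw = node.log_pe
            else:
                # An unvisited child has weighted probability 1 (log 0).
                child_sum = sum(ch.log_pw for ch in node.children
                                if ch is not None)
                node.log_pw = math.log(0.5) + log_add(node.log_pe, child_sum)
        history.append(bit)
    return -root.log_pw / math.log(2)


if __name__ == "__main__":
    sample = [0, 1] * 100
    print(ctw_code_length(sample, 2) / len(sample), "bits per symbol")
```

The sketch reports only the ideal code length; an actual compressor would feed the corresponding conditional probabilities to an arithmetic coder, which is the step the abstract identifies as hard to implement efficiently.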