Evolutionary lossless compression with GP-ZIP*

  • Authors:
  • Ahmad Kattan;Riccardo Poli

  • Affiliations:
  • University of Essex, Colchester, United Kngdm;University of Essex, Colchester, United Kngdm

  • Venue:
  • Proceedings of the 10th annual conference on Genetic and evolutionary computation
  • Year:
  • 2008

Quantified Score

Hi-index 0.00

Visualization

Abstract

In recent research we proposed GP-zip, a system which uses evolution to find optimal ways to combine standard compression algorithms for the purpose of maximally losslessly compressing files and archives. The system divides files into blocks of predefined length. It then uses a linear, fixed-length representation where each primitive indicates what compression algorithm to use for a specific data block. GP-zip worked well with heterogonous data sets, providing significant improvements in compression ratio compared to some of the best standard compression algorithms. In this paper we propose a substantial improvement, called GP-zip*, which uses a new representation and intelligent crossover and mutation operators such that blocks of different sizes can be evolved. Like GP-zip, GP-zip* finds what the best compression technique to use for each block is. The compression algorithms available in the primitive set of GP-zip* are: Arithmetic coding (AC), Lempel-Ziv-Welch (LZW), Unbounded Prediction by Partial Matching (PPMD), Run Length Encoding (RLE), and Boolean Minimization. In addition, two transformation techniques are available: the Burrows-Wheeler Transformation (BWT) and Move to Front (MTF). Results show that GP-zip* provides improvements in compression ratio ranging from a fraction to several tens of percent over its predecessor.