PPMexe: PPM for Compressing Software

  • Authors:
  • Milenko Drinic; Darko Kirovski

  • Venue:
  • DCC '02 Proceedings of the Data Compression Conference
  • Year:
  • 2002

Abstract

With the emergence of software delivery platforms such as Microsoft's .NET, code compression has become one of the core enabling technologies strongly affecting system performance. In this paper, we present PPMexe, a set of compression mechanisms for executables that exploits their syntax and semantics to achieve superior compression rates. The foundation of PPMexe is the generic paradigm of prediction by partial matching (PPM). We combine PPM with two pre-processing steps: instruction rescheduling to improve prediction rates, and partitioning of a program binary into streams with high auto-correlation. We improve the traditional PPM algorithm in two ways: with an additional alphabet of frequent variable-length super-symbols extracted from the input stream of fixed-length symbols, and with a low-overhead mechanism that enables decompression starting from an arbitrary instruction of the executable, a feature pivotal for run-time software delivery. PPMexe was implemented for x86 binaries and tested on several large Microsoft applications. Binaries compressed using PPMexe were 16-23% smaller than files created using PPMD, the best available compressor.
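
As an illustration of why partitioning a binary into separate streams can help a context-based predictor, the sketch below compares the coding cost of one interleaved stream against the same data split in two. It is a toy under stated assumptions, not the PPMexe algorithm: the "opcode" and "operand" bytes are synthetic rather than real x86, the function name order_k_cost_bits is hypothetical, and the model is a single fixed-order adaptive context model with add-one smoothing, which is simpler than PPM (PPM blends several context orders through escape symbols).

```python
import math
from collections import Counter, defaultdict


def order_k_cost_bits(data: bytes, k: int = 1) -> float:
    """Ideal code length, in bits, of `data` under an adaptive order-k model.

    Each byte is predicted from the k bytes before it, using counts seen so
    far plus add-one smoothing over a 256-symbol alphabet. This is a
    simplification of PPM, used only to show the effect of stream splitting.
    """
    counts = defaultdict(Counter)  # context -> counts of the symbols that followed
    totals = Counter()             # context -> number of times it was seen
    bits = 0.0
    for i, sym in enumerate(data):
        ctx = data[max(0, i - k):i]
        p = (counts[ctx][sym] + 1) / (totals[ctx] + 256)
        bits -= math.log2(p)
        counts[ctx][sym] += 1      # update the model after coding the symbol
        totals[ctx] += 1
    return bits


# Synthetic data (assumption): two fields that are each periodic on their own
# but whose periods differ, so neighbours in the interleaved stream predict
# each other poorly.
n = 600
op_stream = bytes([1, 2, 3][i % 3] for i in range(n))
arg_stream = bytes([10, 20, 30, 40][i % 4] for i in range(n))
interleaved = bytes(b for pair in zip(op_stream, arg_stream) for b in pair)

interleaved_bits = order_k_cost_bits(interleaved)
split_bits = order_k_cost_bits(op_stream) + order_k_cost_bits(arg_stream)

print(f"interleaved: {interleaved_bits:.0f} bits")
print(f"split:       {split_bits:.0f} bits")
```

In this toy, each split stream is fully determined by its previous byte, while in the interleaved stream the previous byte belongs to the other field and carries little information, so the split version costs fewer bits. Real executables are far less regular, and the gains reported in the abstract come from the full set of PPMexe mechanisms rather than from splitting alone.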