With the emergence of software delivery platforms such as Microsoft's .NET, code compression has become one of the core enabling technologies strongly affecting system performance. In this paper, we present PPMexe, a set of compression mechanisms for executables that exploits their syntax and semantics to achieve superior compression rates. The foundation of PPMexe is the generic paradigm of prediction by partial matching (PPM). We combine PPM with two pre-processing steps: instruction rescheduling to improve prediction rates, and partitioning of a program binary into streams with high auto-correlation. We improve the traditional PPM algorithm in two ways: with an additional alphabet of frequent variable-length super-symbols extracted from the input stream of fixed-length symbols, and with a low-overhead mechanism that enables decompression starting from an arbitrary instruction of the executable, a feature pivotal for run-time software delivery. PPMexe was implemented for x86 binaries and tested on several large Microsoft applications. Binaries compressed using PPMexe were 16-23% smaller than files created using PPMD, the best available compressor.
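To make the underlying paradigm concrete, the following is a minimal sketch of generic PPM-style context modeling, not of PPMexe itself. It estimates how many bits an ideal arithmetic coder would spend on a byte string when each symbol is predicted from the longest previously seen context, escaping to shorter contexts when the symbol is new. The function name `ppm_cost_bits`, the `max_order` parameter, and the escape estimate (escape count equal to the number of distinct symbols seen in the context) are assumptions chosen for illustration. Instruction rescheduling, stream partitioning, super-symbols, and random-access decompression are not modeled.

```python
import math
from collections import Counter, defaultdict


def ppm_cost_bits(data: bytes, max_order: int = 2) -> float:
    """Estimate the coded size of `data` in bits under a simple PPM-style model.

    Illustrative only: no symbol exclusion, and the escape probability is
    distinct / (total + distinct) for each context (an assumed estimator).
    """
    # contexts[k] maps a k-byte context to counts of the symbols that followed it.
    contexts = [defaultdict(Counter) for _ in range(max_order + 1)]
    total_bits = 0.0

    for i, sym in enumerate(data):
        coded = False
        # Try the longest available context first, then fall back to shorter ones.
        for k in range(min(max_order, i), -1, -1):
            counts = contexts[k].get(data[i - k:i])
            if not counts:
                # Context never seen: the decoder knows this too, so no cost.
                continue
            total = sum(counts.values())
            distinct = len(counts)
            if sym in counts:
                total_bits += -math.log2(counts[sym] / (total + distinct))
                coded = True
                break
            # Symbol is new in this context: pay for an escape and go shorter.
            total_bits += -math.log2(distinct / (total + distinct))
        if not coded:
            # Order -1 fallback: uniform distribution over all 256 byte values.
            total_bits += 8.0

        # Update the statistics of every context order with the symbol just coded.
        for k in range(min(max_order, i) + 1):
            contexts[k][data[i - k:i]][sym] += 1

    return total_bits
```

Calling `ppm_cost_bits(b"ab" * 200)` gives an estimate far below the 3200 bits of the raw input, because after a short warm-up every symbol is predicted from a context that has already seen it. Input with no repeated symbols costs slightly more than 8 bits per byte, since each new symbol also pays for an escape.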