Automatic inference of models for statistical code compression

Authors:
Christopher W. Fraser
Affiliations:
Microsoft Research, One Microsoft Way, Redmond, WA
Venue:
Proceedings of the ACM SIGPLAN 1999 conference on Programming language design and implementation
Year:
1999

Abstract

This paper describes experiments that apply machine learning to compress computer programs, formalizing and automating decisions about instruction encoding that have traditionally been made by humans in a more ad hoc manner. A program accepts a large training set of program material in a conventional compiler intermediate representation (IR) and automatically infers a decision tree that separates IR code into streams that compress much better than the undifferentiated whole. Driving a conventional arithmetic compressor with this model yields code 30% smaller than the previous record for IR code compression, and 24% smaller than an ambitious optimizing compiler feeding an ambitious general-purpose data compressor.
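
The stream-splitting idea in the abstract can be illustrated with a toy sketch. It is not the paper's system: the operator names, the `prev_op` predictor, and the one-level choice among candidate predictors are assumptions made for illustration, and the ideal order-0 entropy stands in for the output size of an arithmetic coder. The sketch shows only that partitioning a symbol sequence by a well-chosen context can lower the total number of bits needed.

```python
import math
from collections import Counter, defaultdict

def entropy_bits(symbols):
    """Bits an ideal order-0 entropy coder would need for `symbols`."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum(c * math.log2(c / n) for c in counts.values())

def split_cost(symbols, context_of):
    """Total bits when each context gets its own stream and its own model."""
    streams = defaultdict(list)
    for i, sym in enumerate(symbols):
        streams[context_of(symbols, i)].append(sym)
    return sum(entropy_bits(stream) for stream in streams.values())

# Candidate predictors (hypothetical): each maps a position to a context label.
def no_context(seq, i):
    return "all"  # one undifferentiated stream

def prev_op(seq, i):
    return seq[i - 1] if i > 0 else "<start>"  # previous operator

# Toy IR operator sequence; the names are illustrative only.
ir = ["ADDRL", "INDIR", "CNST", "ADD", "ADDRL", "ASGN"] * 50

candidates = {"no_context": no_context, "prev_op": prev_op}
costs = {name: split_cost(ir, fn) for name, fn in candidates.items()}

# One-level "tree inference": keep the predictor that minimizes total bits.
best = min(costs, key=costs.get)

for name, bits in costs.items():
    print(f"{name}: {bits:.1f} bits")
print("best predictor:", best)
```

On this toy input the single stream costs about 675 bits, while splitting by the previous operator costs 100 bits, because most operators are fully determined by their predecessor. The comparison ignores the cost of transmitting or learning the per-stream models, which a real compressor has to account for. A full decision tree would repeat the selection step recursively within each stream.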