Performance optimization by dynamic code transformation

Authors:
Josef Weidendorfer;Tilman Küstner;Sally A. McKee
Affiliations:
Technische Universität München, Germany;Technische Universität München, Germany;Chalmers University of Technology, Sweden
Venue:
Proceedings of the 8th ACM International Conference on Computing Frontiers
Year:
2011

Abstract

Even parts of a program that are sequential or just inherently difficult to parallelize can be optimized for instruction-level parallelism (ILP). For instance, eliminating loop overheads and potential pipeline stalls from control flow can alleviate performance bottlenecks. Unfortunately, static compilation is limited in the extent to which it can identify opportunities to apply such optimizations. Generating code dynamically at run time, however, can create much more efficient applications by using information not available at compile time. We demonstrate our approach on a sparse-matrix PET scan code by aggressively unrolling loops and specializing code via dynamic code generation. We leverage task-level parallelism by having an auxiliary processor core concurrently generate code and feed it to the core executing the application. Our approach to fast code generation leverages patching and concatenating prepared code skeletons.
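
The idea of patching and concatenating prepared skeletons can be illustrated with a minimal sketch. It is not the authors' implementation, which the abstract does not detail. It assumes a single sparse-matrix row stored as column indices and values, and it uses Python source strings in place of real code skeletons. The names TERM_SKELETON, generate_row_kernel, and generic_row_kernel are hypothetical. Each nonzero entry fills in one copy of the skeleton, the copies are concatenated into a fully unrolled kernel with no loop or index lookup, and the result is compiled at run time.

```python
# Hypothetical skeleton for one nonzero entry. The placeholders {val} and
# {col} are the holes that get patched with run-time data.
TERM_SKELETON = "    acc += {val!r} * x[{col}]\n"


def generate_row_kernel(cols, vals):
    """Build a fully unrolled dot product specialized for one sparse row."""
    # Patch one skeleton copy per nonzero entry, then concatenate the copies.
    body = "".join(
        TERM_SKELETON.format(val=v, col=c) for c, v in zip(cols, vals)
    )
    source = "def kernel(x):\n    acc = 0.0\n" + body + "    return acc\n"
    namespace = {}
    exec(source, namespace)  # compile the generated source at run time
    return namespace["kernel"]


def generic_row_kernel(cols, vals, x):
    """Reference version: a loop that reads indices and values on every call."""
    acc = 0.0
    for c, v in zip(cols, vals):
        acc += v * x[c]
    return acc


if __name__ == "__main__":
    cols = [0, 3, 4]
    vals = [2.0, -1.5, 0.25]
    x = [1.0, 2.0, 3.0, 4.0, 5.0]
    kernel = generate_row_kernel(cols, vals)
    print(kernel(x), generic_row_kernel(cols, vals, x))
```

The specialized kernel is worthwhile only when the same row is applied to many vectors, since the generation cost has to be amortized. The abstract addresses that cost by moving code generation to an auxiliary core, which this sketch does not attempt.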