Spike: an optimizer for Alpha/NT executables

Authors:
Robert Cohn;David Goodwin;P. Geoffrey Lowney;Norman Rubin
Affiliations:
Digital Equipment Corporation;Digital Equipment Corporation;Digital Equipment Corporation;Digital Equipment Corporation
Venue:
NT'97 Proceedings of the USENIX Windows NT Workshop 1997
Year:
1997

Citing 9
Cited 27

Achieving high instruction cache performance with an optimizing compiler

ISCA '89 Proceedings of the 16th annual international symposium on Computer architecture
Profile guided code positioning

PLDI '90 Proceedings of the ACM SIGPLAN 1990 conference on Programming language design and implementation
The multiflow trace scheduling compiler

The Journal of Supercomputing - Special issue on instruction-level parallelism
Link-time optimization of address calculation on a 64-bit architecture

PLDI '94 Proceedings of the ACM SIGPLAN 1994 conference on Programming language design and implementation
ATOM: a system for building customized program analysis tools

PLDI '94 Proceedings of the ACM SIGPLAN 1994 conference on Programming language design and implementation
Delivering binary object modification tools for program analysis and optimization

Digital Technical Journal
Hot cold optimization of large Windows/NT applications

Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture
Interprocedural dataflow analysis in an executable optimizer

Proceedings of the ACM SIGPLAN 1997 conference on Programming language design and implementation
DIGITAL FX!32: Running 32-Bit x86 Applications on Alpha NT

COMPCON '97 Proceedings of the 42nd IEEE International Computer Conference

Continuous profiling: where have all the cycles gone?

ACM Transactions on Computer Systems (TOCS)
Continuous profiling: where have all the cycles gone?

Proceedings of the sixteenth ACM symposium on Operating systems principles
Alias analysis of executable code

POPL '98 Proceedings of the 25th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Impact of economics on compiler optimization

Proceedings of the 2001 joint ACM-ISCOPE conference on Java Grande
Sifting out the mud: low level C++ code reuse

OOPSLA '02 Proceedings of the 17th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications
FX!32: A Profile-Directed Binary Translator

IEEE Micro
Speculative Alias Analysis for Executable Code

Proceedings of the 2002 International Conference on Parallel Architectures and Compilation Techniques
Load Redundancy Elimination on Executable Code

Euro-Par '01 Proceedings of the 7th International Euro-Par Conference Manchester on Parallel Processing
Fetching instruction streams

Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
Static program analysis of embedded executable assembly code

Proceedings of the 2004 international conference on Compilers, architecture, and synthesis for embedded systems
Software Trace Cache

IEEE Transactions on Computers
A first look at the interplay of code reordering and configurable caches

GLSVLSI '05 Proceedings of the 15th ACM Great Lakes symposium on VLSI
System-wide compaction and specialization of the linux kernel

LCTES '05 Proceedings of the 2005 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems
Link-time binary rewriting techniques for program compaction

ACM Transactions on Programming Languages and Systems (TOPLAS)
Fast and efficient partial code reordering: taking advantage of dynamic recompilation

Proceedings of the 5th international symposium on Memory management
Dynamic code management: improving whole program code locality in managed runtimes

Proceedings of the 2nd international conference on Virtual execution environments
Bidirectional liveness analysis, or how less than half of the Alpha's registers are used

Journal of Systems Architecture: the EUROMICRO Journal
Link-time compaction and optimization of ARM executables

ACM Transactions on Embedded Computing Systems (TECS)
Code reordering on limited branch offset

ACM Transactions on Architecture and Code Optimization (TACO)
HP caliper: an architecture for performance analysis tools

WIESS'00 Proceedings of the 1st conference on Industrial Experiences with Systems Software - Volume 1
Evaluating the importance of user-specific profiling

WINSYM'98 Proceedings of the 2nd conference on USENIX Windows NT Symposium - Volume 2
Automated reduction of the memory footprint of the Linux kernel

ACM Transactions on Embedded Computing Systems (TECS) - Special Section LCTES'05
Enlarging Instruction Streams

IEEE Transactions on Computers
A latency-conscious SMT branch prediction architecture

International Journal of High Performance Computing and Networking
Automatic Parallelization in a Binary Rewriter

MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
Issues and support for dynamic register allocation

ACSAC'06 Proceedings of the 11th Asia-Pacific conference on Advances in Computer Systems Architecture
Combining code reordering and cache configuration

ACM Transactions on Embedded Computing Systems (TECS)

Abstract

Spike is a profile-directed optimizer for Alpha/NT executables that is actively being used to optimize shipping products. Spike consists of the Spike Optimization Environment (SOE) and the Spike Optimizer. Through both a graphical interface and a command-line interface, SOE provides a simple means to instrument and optimize large applications consisting of many images. SOE manages the instrumented and optimized images, as well as any profile information collected for those images, freeing the user from many tedious and error-prone tasks typically associated with profile-directed optimization. SOE also simplifies the collection of profile information with Transparent Application Substitution (TAS). With TAS, the user invokes the original version of the application, and the instrumented or optimized version of the application is transparently executed in its place. SOE uses the Spike Optimizer to optimize images. The Spike Optimizer performs code layout to improve instruction cache behavior [Pettis90], hot cold optimization [Cohn96], and register allocation. The optimizations are targeted at large call-intensive applications, where loops span multiple routines and each routine contains complex control flow. For this class of applications, Spike provides significant performance improvement, reducing execution time by as much as 20%.
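
The profile-directed code layout mentioned in the abstract can be illustrated with a small sketch. The Python below is a simplified greedy chaining of basic blocks in the spirit of the approach cited as [Pettis90]; it is not Spike's implementation. The function name `layout_blocks`, the block names, and the edge counts are all hypothetical, and the rule used to order leftover chains (hottest first) is a simplifying assumption.

```python
def layout_blocks(entry, edges):
    """Greedy bottom-up chaining of basic blocks (illustrative sketch).

    entry: name of the routine's entry block.
    edges: dict mapping (src, dst) -> execution count from a profile.
    Returns block names in the chosen layout order.
    """
    blocks = {entry}
    for src, dst in edges:
        blocks.update((src, dst))

    # Every block starts in its own single-element chain.
    chain_of = {b: [b] for b in blocks}

    # Visit the hottest edges first so they become fall-through paths.
    ordered = sorted(edges.items(), key=lambda kv: (-kv[1], kv[0]))
    for (src, dst), _count in ordered:
        a, b = chain_of[src], chain_of[dst]
        # Merge only when src ends one chain and dst starts a different one.
        # The entry block is kept at the head of its chain (assumption).
        if a is not b and a[-1] == src and b[0] == dst and dst != entry:
            a.extend(b)
            for blk in b:
                chain_of[blk] = a

    # Collect the distinct chains in a deterministic order.
    chains, seen = [], set()
    for blk in sorted(blocks):
        chain = chain_of[blk]
        if id(chain) not in seen:
            seen.add(id(chain))
            chains.append(chain)

    def weight(chain):
        # Total profile count on edges touching this chain.
        members = set(chain)
        return sum(n for (s, d), n in edges.items()
                   if s in members or d in members)

    # Entry chain first, then remaining chains from hottest to coldest.
    first = chain_of[entry]
    rest = sorted((c for c in chains if c is not first),
                  key=lambda c: -weight(c))
    return [blk for chain in [first] + rest for blk in chain]


# Hypothetical profile: the path A -> B -> D is hot, A -> C -> D is cold.
example_edges = {("A", "B"): 90, ("B", "D"): 90, ("A", "C"): 10, ("C", "D"): 10}
print(layout_blocks("A", example_edges))  # ['A', 'B', 'D', 'C']
```

In this example the frequently executed blocks end up contiguous, so the hot path runs as straight-line fall-through code, while the rarely executed block C is placed after it.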