Achieving high instruction cache performance with an optimizing compiler
ISCA '89 Proceedings of the 16th annual international symposium on Computer architecture
Profile guided code positioning
PLDI '90 Proceedings of the ACM SIGPLAN 1990 conference on Programming language design and implementation
The multiflow trace scheduling compiler
The Journal of Supercomputing - Special issue on instruction-level parallelism
Link-time optimization of address calculation on a 64-bit architecture
PLDI '94 Proceedings of the ACM SIGPLAN 1994 conference on Programming language design and implementation
ATOM: a system for building customized program analysis tools
PLDI '94 Proceedings of the ACM SIGPLAN 1994 conference on Programming language design and implementation
Delivering binary object modification tools for program analysis and optimization
Digital Technical Journal
Hot cold optimization of large Windows/NT applications
Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture
Interprocedural dataflow analysis in an executable optimizer
Proceedings of the ACM SIGPLAN 1997 conference on Programming language design and implementation
DIGITAL FX!32: Running 32-Bit x86 Applications on Alpha NT
COMPCON '97 Proceedings of the 42nd IEEE International Computer Conference
Continuous profiling: where have all the cycles gone?
ACM Transactions on Computer Systems (TOCS)
Continuous profiling: where have all the cycles gone?
Proceedings of the sixteenth ACM symposium on Operating systems principles
Alias analysis of executable code
POPL '98 Proceedings of the 25th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Impact of economics on compiler optimization
Proceedings of the 2001 joint ACM-ISCOPE conference on Java Grande
Sifting out the mud: low level C++ code reuse
OOPSLA '02 Proceedings of the 17th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications
FX!32: A Profile-Directed Binary Translator
IEEE Micro
Speculative Alias Analysis for Executable Code
Proceedings of the 2002 International Conference on Parallel Architectures and Compilation Techniques
Load Redundancy Elimination on Executable Code
Euro-Par '01 Proceedings of the 7th International Euro-Par Conference Manchester on Parallel Processing
Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
Static program analysis of embedded executable assembly code
Proceedings of the 2004 international conference on Compilers, architecture, and synthesis for embedded systems
IEEE Transactions on Computers
A first look at the interplay of code reordering and configurable caches
GLSVLSI '05 Proceedings of the 15th ACM Great Lakes symposium on VLSI
System-wide compaction and specialization of the linux kernel
LCTES '05 Proceedings of the 2005 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems
Link-time binary rewriting techniques for program compaction
ACM Transactions on Programming Languages and Systems (TOPLAS)
Fast and efficient partial code reordering: taking advantage of dynamic recompilation
Proceedings of the 5th international symposium on Memory management
Dynamic code management: improving whole program code locality in managed runtimes
Proceedings of the 2nd international conference on Virtual execution environments
Bidirectional liveness analysis, or how less than half of the alpha's registers are used
Journal of Systems Architecture: the EUROMICRO Journal
Link-time compaction and optimization of ARM executables
ACM Transactions on Embedded Computing Systems (TECS)
Code reordering on limited branch offset
ACM Transactions on Architecture and Code Optimization (TACO)
HP caliper: an architecture for performance analysis tools
WIESS'00 Proceedings of the 1st conference on Industrial Experiences with Systems Software - Volume 1
Evaluating the importance of user-specific profiling
WINSYM'98 Proceedings of the 2nd conference on USENIX Windows NT Symposium - Volume 2
Automated reduction of the memory footprint of the Linux kernel
ACM Transactions on Embedded Computing Systems (TECS) - Special Section LCTES'05
IEEE Transactions on Computers
A latency-conscious SMT branch prediction architecture
International Journal of High Performance Computing and Networking
Automatic Parallelization in a Binary Rewriter
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
Issues and support for dynamic register allocation
ACSAC'06 Proceedings of the 11th Asia-Pacific conference on Advances in Computer Systems Architecture
Combining code reordering and cache configuration
ACM Transactions on Embedded Computing Systems (TECS)
Spike is a profile-directed optimizer for Alpha/NT executables that is actively being used to optimize shipping products. Spike consists of the Spike Optimization Environment (SOE) and the Spike Optimizer. Through both a graphical interface and a command-line interface, SOE provides a simple means to instrument and optimize large applications consisting of many images. SOE manages the instrumented and optimized images, as well as any profile information collected for those images, freeing the user from many tedious and error-prone tasks typically associated with profile-directed optimization. SOE also simplifies the collection of profile information with Transparent Application Substitution (TAS). With TAS, the user invokes the original version of the application, and the instrumented or optimized version is transparently executed in its place. SOE uses the Spike Optimizer to optimize images. The Spike Optimizer performs code layout to improve instruction cache behavior [Pettis90], hot cold optimization [Cohn96], and register allocation. The optimizations are targeted at large call-intensive applications, where loops span multiple routines and each routine contains complex control flow. For this class of applications, Spike provides significant performance improvement, reducing execution time by as much as 20%.
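To give a feel for the profile-directed code layout mentioned above, here is a minimal sketch of greedy chain formation in the spirit of [Pettis90]. It is not Spike's implementation: the function name `layout_blocks`, the block names, and the edge counts are all hypothetical, and the rule used to order the leftover chains is a simplification chosen for illustration.

```python
def layout_blocks(entry, edge_counts):
    """Order basic blocks so that hot control-flow edges become fall-throughs.

    entry       -- name of the routine's entry block (hypothetical naming)
    edge_counts -- dict mapping (src, dst) block pairs to profiled counts
    Returns a list of block names in layout order.
    """
    # Every block starts out in a chain of its own.
    chain_of = {entry: [entry]}
    for src, dst in edge_counts:
        for block in (src, dst):
            chain_of.setdefault(block, [block])

    # Visit edges from hottest to coldest (ties broken by name for determinism).
    hottest_first = sorted(edge_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    for (src, dst), _count in hottest_first:
        head, tail = chain_of[src], chain_of[dst]
        # Merge only when src ends one chain and dst begins a different one.
        # The entry block is never allowed to move off the front of its chain.
        if head is not tail and head[-1] == src and tail[0] == dst and dst != entry:
            head.extend(tail)
            for block in tail:
                chain_of[block] = head

    # Collect the distinct chains that remain after merging.
    seen, chains = set(), []
    for chain in chain_of.values():
        if id(chain) not in seen:
            seen.add(id(chain))
            chains.append(chain)

    def heat(chain):
        # Simplified hotness measure: total count of edges leaving the chain's blocks.
        members = set(chain)
        return sum(c for (s, _d), c in edge_counts.items() if s in members)

    # Entry chain first, then the remaining chains from hottest to coldest.
    entry_chain = chain_of[entry]
    rest = sorted((c for c in chains if c is not entry_chain),
                  key=lambda c: (-heat(c), c[0]))
    return entry_chain + [block for chain in rest for block in chain]


# Hypothetical profile: the A -> B -> D path is hot, the path through C is cold.
profile = {("A", "B"): 90, ("B", "D"): 90, ("A", "C"): 10, ("C", "D"): 10}
print(layout_blocks("A", profile))  # prints ['A', 'B', 'D', 'C']
```

In this example the hot path ends up contiguous and the rarely executed block `C` is pushed to the end, which is the effect that layout aims for when improving instruction cache behavior.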