Reduced instruction set computers
Communications of the ACM - Special section on computer architecture
ASPLOS I Proceedings of the first international symposium on Architectural support for programming languages and operating systems
Register allocation & spilling via graph coloring
SIGPLAN '82 Proceedings of the 1982 SIGPLAN symposium on Compiler construction
Gprof: A call graph execution profiler
SIGPLAN '82 Proceedings of the 1982 SIGPLAN symposium on Compiler construction
Performance of various computers using standard linear equations software in a Fortran environment
ACM SIGARCH Computer Architecture News
A portable machine-independent global optimizer--design and measurements
A portable machine-independent global optimizer--design and measurements
WISQ: a restartable architecture using queues
ISCA '87 Proceedings of the 14th annual international symposium on Computer architecture
The Mahler experience: using an intermediate language as the machine description
ASPLOS II Proceedings of the second international conference on Architectual support for programming languages and operating systems
A VLIW architecture for a trace scheduling compiler
ASPLOS II Proceedings of the second international conference on Architectual support for programming languages and operating systems
A VLIW architecture for a trace Scheduling Compiler
IEEE Transactions on Computers - Special issue on architectural support for programming languages and operating systems
Register windows vs. register allocation
PLDI '88 Proceedings of the ACM SIGPLAN 1988 conference on Programming Language design and Implementation
Minimizing register usage penalty at procedure calls
PLDI '88 Proceedings of the ACM SIGPLAN 1988 conference on Programming Language design and Implementation
A portable global optimizer and linker
PLDI '88 Proceedings of the ACM SIGPLAN 1988 conference on Programming Language design and Implementation
A simple interprocedural register allocation algorithm and its effectiveness for LISP
ACM Transactions on Programming Languages and Systems (TOPLAS)
Data buffering: run-time versus compile-time support
ASPLOS III Proceedings of the third international conference on Architectural support for programming languages and operating systems
Using registers to optimize cross-domain call performance
ASPLOS III Proceedings of the third international conference on Architectural support for programming languages and operating systems
Available instruction-level parallelism for superscalar and superpipelined machines
ASPLOS III Proceedings of the third international conference on Architectural support for programming languages and operating systems
Spill code minimization techniques for optimizing compliers
PLDI '89 Proceedings of the ACM SIGPLAN 1989 Conference on Programming language design and implementation
Determining average program execution times and their variance
PLDI '89 Proceedings of the ACM SIGPLAN 1989 Conference on Programming language design and implementation
IEEE Transactions on Computers
Register allocation across procedure and module boundaries
PLDI '90 Proceedings of the ACM SIGPLAN 1990 conference on Programming language design and implementation
Architectural support for reduced register saving/restoring in single-window register files
ACM Transactions on Computer Systems (TOCS)
The interaction of architecture and operating system design
ASPLOS IV Proceedings of the fourth international conference on Architectural support for programming languages and operating systems
Mapping concurrent programs to VLIW processors
PPOPP '91 Proceedings of the third ACM SIGPLAN symposium on Principles and practice of parallel programming
Predicting program behavior using real or estimated profiles
PLDI '91 Proceedings of the ACM SIGPLAN 1991 conference on Programming language design and implementation
Flexible register management for sequential programs
ISCA '91 Proceedings of the 18th annual international symposium on Computer architecture
The effect on RISC performance of register set size and structure versus code generation strategy
ISCA '91 Proceedings of the 18th annual international symposium on Computer architecture
Experience with a software-defined machine architecture
ACM Transactions on Programming Languages and Systems (TOPLAS)
Processor Architecture and Data Buffering
IEEE Transactions on Computers
Register relocation: flexible contexts for multithreading
ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
PLDI '94 Proceedings of the ACM SIGPLAN 1994 conference on Programming language design and implementation
Accurate static branch prediction by value range propagation
PLDI '95 Proceedings of the ACM SIGPLAN 1995 conference on Programming language design and implementation
Interprocedural register allocation for lazy functional languages
FPCA '95 Proceedings of the seventh international conference on Functional programming languages and computer architecture
Proceedings of the 28th annual international symposium on Microarchitecture
Efficient and language-independent mobile programs
PLDI '96 Proceedings of the ACM SIGPLAN 1996 conference on Programming language design and implementation
Fast, effective dynamic compilation
PLDI '96 Proceedings of the ACM SIGPLAN 1996 conference on Programming language design and implementation
Demand-driven register allocation
ACM Transactions on Programming Languages and Systems (TOPLAS)
Whole-program optimization for time and space efficient threads
Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
Minimum cost interprocedural register allocation
POPL '96 Proceedings of the 23rd ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Hot cold optimization of large Windows/NT applications
Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture
Dynamic feedback: an effective technique for adaptive computing
Proceedings of the ACM SIGPLAN 1997 conference on Programming language design and implementation
Call-cost directed register allocation
Proceedings of the ACM SIGPLAN 1997 conference on Programming language design and implementation
Alias analysis of executable code
POPL '98 Proceedings of the 25th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Quality and speed in linear-scan register allocation
PLDI '98 Proceedings of the ACM SIGPLAN 1998 conference on Programming language design and implementation
Scalable cross-module optimization
PLDI '98 Proceedings of the ACM SIGPLAN 1998 conference on Programming language design and implementation
Experiences with Cooperating Register Allocation and Instruction Scheduling
International Journal of Parallel Programming
Java annotation-aware just-in-time (AJIT) complilation system
JAVA '99 Proceedings of the ACM 1999 conference on Java Grande
A Tree-Based Alternative to Java Byte-Codes
International Journal of Parallel Programming
Eliminating synchronization overhead in automatically parallelized programs using dynamic feedback
ACM Transactions on Computer Systems (TOCS)
APRIL: a processor architecture for multiprocessing
ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture
Reducing the cost of branches by using registers
ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture
Generation and analysis of very long address traces
ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture
The store-load address table and speculative register promotion
Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture
P-code and compiler portability: experience with a Modula-2 optimizing compiler
ACM SIGPLAN Notices
Optimization of available C compilers for the MC68HC11
ACM-SE 30 Proceedings of the 30th annual Southeast regional conference
Interprocedural register allocation for RISC machines
ACM-SE 30 Proceedings of the 30th annual Southeast regional conference
Reducing Memory Latency via Read-after-Read Memory Dependence Prediction
IEEE Transactions on Computers
Inter-task register-allocation for static operating systems
Proceedings of the joint conference on Languages, compilers and tools for embedded systems: software and compilers for embedded systems
Performance Tradeoffs in Multithreaded Processors
IEEE Transactions on Parallel and Distributed Systems
Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
Speculative register promotion using Advanced Load Address Table (ALAT)
Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
Inter-procedural stacked register allocation for itanium® like architecture
ICS '03 Proceedings of the 17th annual international conference on Supercomputing
The Named-State Register File: Implementation and Performance
HPCA '95 Proceedings of the 1st IEEE Symposium on High-Performance Computer Architecture
LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation
Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
Register windows vs. register allocation
ACM SIGPLAN Notices - Best of PLDI 1979-1999
Profile guided code positioning
ACM SIGPLAN Notices - Best of PLDI 1979-1999
Predicting program behavior using real or estimated profiles
ACM SIGPLAN Notices - Best of PLDI 1979-1999
Binary translation to improve energy efficiency through post-pass register re-allocation
Proceedings of the 4th ACM international conference on Embedded software
Optimal register reassignment for register stack overflow minimization
ACM Transactions on Architecture and Code Optimization (TACO)
Performance and security lessons learned from virtualizing the alpha processor
Proceedings of the 34th annual international symposium on Computer architecture
A practical interprocedural dominance algorithm
ACM Transactions on Programming Languages and Systems (TOPLAS)
The compiler as a static analysis tool
Proceedings of the 2007 ACM international conference on SIGAda annual international conference
Interprocedural Speculative Optimization of Memory Accesses to Global Variables
Euro-Par '08 Proceedings of the 14th international Euro-Par conference on Parallel Processing
Proceedings of the 8th ACM International Conference on Computing Frontiers
Hi-index | 0.01 |
In previous work in global register allocation, the compiler colors a conflict graph constructed from liveness dataflow information, in order to allocate the same register to many variables that are not simultaneously live. If two procedures are in separately compiled modules, however, the compiler must do this allocation separately for each procedure. As a result, the two procedures might use different registers for the same global, or the same register for different locals.We can remove these problems if we delay the register allocation until link time. Our compiler produces object modules that can be linked and run without global register allocation, but includes with each object module a body of information describing how the module uses variables and procedures. A link-time register allocator then decides which variables are used most frequently, selects registers for them, and rewrites the code to reflect the decision that these variables reside in registers rather than in memory. Construction of the call graph allows us to use the same register for locals of procedures that are not simultaneously active, giving us most of the advantages of a full-scale coloring without the expense.When we use our method for 52 registers, our benchmarks speed up by 10 to 25 percent. Even with only 8 registers, the speedup can be nearly that large if we use previously collected profile information to guide the allocation. We cannot do much better, because programs whose variables all fit in registers rarely speed up by more than 30%. Moreover, profiling shows us that we usually remove 60% to 90% of the loads and stores of scalar variables that the program performs during its execution, and often much more.