Interprocedural dependence analysis and parallelization
SIGPLAN '86 Proceedings of the 1986 SIGPLAN symposium on Compiler construction
Automatic translation of FORTRAN programs to vector form
ACM Transactions on Programming Languages and Systems (TOPLAS)
Automatic decomposition of scientific programs for parallel execution
POPL '87 Proceedings of the 14th ACM SIGACT-SIGPLAN symposium on Principles of programming languages
Strategies for cache and local memory management by global program transformation
Proceedings of the 1st International Conference on Supercomputing
Testing for the Church-Rosser Property
Journal of the ACM (JACM)
Code Generation for a One-Register Machine
Journal of the ACM (JACM)
Communications of the ACM - Special issue on computer architecture
Register allocation by priority-based coloring
SIGPLAN '84 Proceedings of the 1984 SIGPLAN symposium on Compiler construction
SIGPLAN '84 Proceedings of the 1984 SIGPLAN symposium on Compiler construction
A portable optimizing compiler for Modula-2
SIGPLAN '84 Proceedings of the 1984 SIGPLAN symposium on Compiler construction
Conversion of control dependence to data dependence
POPL '83 Proceedings of the 10th ACM SIGACT-SIGPLAN symposium on Principles of programming languages
Dependence graphs and compiler optimizations
POPL '81 Proceedings of the 8th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Structure of Computers and Computations
Structure of Computers and Computations
LFP '84 Proceedings of the 1984 ACM Symposium on LISP and functional programming
An analysis of the Cray-1 computer
ISCA '78 Proceedings of the 5th annual symposium on Computer architecture
Register allocation & spilling via graph coloring
SIGPLAN '82 Proceedings of the 1982 SIGPLAN symposium on Compiler construction
The history of FORTRAN I, II, and III
ACM SIGPLAN Notices - Special issue: History of programming languages conference
Fortran for the Texas Instruments ASC system
Proceedings of the conference on Programming languages and compilers for parallel and vector machines
Improving the performance of virtual memory computers.
Improving the performance of virtual memory computers.
Dependence analysis for subscripted variables and its application to program transformations
Dependence analysis for subscripted variables and its application to program transformations
Optimizing supercompilers for supercomputers
Optimizing supercompilers for supercomputers
Unified compilation of Fortran 77D and 90D
ACM Letters on Programming Languages and Systems (LOPLAS)
Memory data organization for improved cache performance in embedded processor applications
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Loop fusion in high performance Fortran
ICS '98 Proceedings of the 12th international conference on Supercomputing
Proceedings of the 14th international conference on Supercomputing
Improving Memory Traffic by Assembly-Level Exploitation of Reuses for Vector Registers
The Journal of Supercomputing
Power and Speed-Efficient Code Transformation of Video Compression Algorithms for RISC Processors
Journal of VLSI Signal Processing Systems - Special issue on multimedia signal processing
International Journal of Parallel Programming
Compilation Techniques for Multimedia Processors
International Journal of Parallel Programming
Improving Effective Bandwidth through Compiler Enhancement of Global Cache Reuse
IPDPS '01 Proceedings of the 15th International Parallel & Distributed Processing Symposium
Advanced Scalarization of Array Syntax
CC '00 Proceedings of the 9th International Conference on Compiler Construction
Improving effective bandwidth through compiler enhancement of global cache reuse
Journal of Parallel and Distributed Computing
Improving register allocation for subscripted variables
ACM SIGPLAN Notices - Best of PLDI 1979-1999
The Energy Impact of Aggressive Loop Fusion
Proceedings of the 13th International Conference on Parallel Architectures and Compilation Techniques
A case for a working-set-based memory hierarchy
Proceedings of the 2nd conference on Computing frontiers
Energy minimization with loop fusion and multi-functional-unit scheduling for multidimensional DSP
Journal of Parallel and Distributed Computing
Redundancy elimination revisited
Proceedings of the 17th international conference on Parallel architectures and compilation techniques
Exploiting loop-dependent stream reuse for stream processors
Proceedings of the 17th international conference on Parallel architectures and compilation techniques
Dependence-based code generation for a CELL processor
LCPC'06 Proceedings of the 19th international conference on Languages and compilers for parallel computing
Exploiting the reuse supplied by loop-dependent stream references for stream processors
ACM Transactions on Architecture and Code Optimization (TACO)
Embedded Systems Design
Hi-index | 14.98 |
The problem of allocating vector registers on supercomputers is addressed in the context of compiling vector languages. Two subproblems must be solved to achieve good vector register allocation. First, the vector operations in the source program must be subdivided into sections that fit the hardware of the target machine. Second, the locality of reference of the vector operations must be improved via aggressive program transformations. Solutions to both of these problems, based on the use of novel aspects of data dependence, are presented. The techniques described extend naturally to scalar machines by observing that a scalar register is simply a vector register of length one.