Advanced compiler optimizations for supercomputers
Communications of the ACM - Special issue on parallelism
A vectorizing Fortran compiler
IBM Journal of Research and Development
Automatic translation of FORTRAN programs to vector form
ACM Transactions on Programming Languages and Systems (TOPLAS)
Vectorizing compilers: a test suite and results
Proceedings of the 1988 ACM/IEEE conference on Supercomputing
Improving register allocation for subscripted variables
PLDI '90 Proceedings of the ACM SIGPLAN 1990 conference on Programming language design and implementation
A data locality optimizing algorithm
PLDI '91 Proceedings of the ACM SIGPLAN 1991 conference on Programming language design and implementation
The art of computer programming, volume 1 (3rd ed.): fundamental algorithms
The art of computer programming, volume 1 (3rd ed.): fundamental algorithms
A Survey of Parallel Machine Organization and Programming
ACM Computing Surveys (CSUR)
The parallel execution of DO loops
Communications of the ACM
Optimizing compilers for modern architectures: a dependence-based approach
Optimizing compilers for modern architectures: a dependence-based approach
Conversion of control dependence to data dependence
POPL '83 Proceedings of the 10th ACM SIGACT-SIGPLAN symposium on Principles of programming languages
Dependence graphs and compiler optimizations
POPL '81 Proceedings of the 8th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
High Performance Compilers for Parallel Computing
High Performance Compilers for Parallel Computing
Fortran for the Texas Instruments ASC system
Proceedings of the conference on Programming languages and compilers for parallel and vector machines
Dependence analysis for subscripted variables and its application to program transformations
Dependence analysis for subscripted variables and its application to program transformations
An automated approach to improve communication-computation overlap in clusters
IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
Loop acceleration exploration for ASIP architecture
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Hi-index | 0.00 |
Parallel and vector machines are becoming increasingly important to many computation intensive applications. Effectively utilizing such architectures, particularly from sequential languages such as Fortran, has demanded increasingly sophisticated compilers. In general, a compiler needs to significantly reorder a program in order to generate code optimal for a specific architecture.Because DO loops typically control the execution of a number of statements, the order in which loops are executed can dramatically affect the performance of a machine on a particular section of code. In particular, loop interchange can often be used to enhance the performance of code on parallel or vector machines.Determining when loops may be safely and profitably interchanged requires a study of the data dependences in the program. This work discusses specific applications of that theory to loop interchange. This theory is described as it has been implemented in PFC (Parallel Fortran Converter) -- a program which attempts to uncover operations in sequential Fortran code that may be safely rewritten as vector operations.