Interprocedural dependence analysis and parallelization. SIGPLAN '86: Proceedings of the 1986 SIGPLAN Symposium on Compiler Construction.
Automatic translation of FORTRAN programs to vector form. ACM Transactions on Programming Languages and Systems (TOPLAS).
A global approach to detection of parallelism.
Prototyping Fortran-90 compilers for massively parallel machines. PLDI '92: Proceedings of the ACM SIGPLAN 1992 Conference on Programming Language Design and Implementation.
IEEE Transactions on Computers.
Compiler transformations for high-performance computing. ACM Computing Surveys (CSUR).
Optimizing Fortran90D/HPF for distributed-memory computers.
The implementation and evaluation of fusion and contraction in array languages. PLDI '98: Proceedings of the ACM SIGPLAN 1998 Conference on Programming Language Design and Implementation.
Loop fusion in high performance Fortran. ICS '98: Proceedings of the 12th International Conference on Supercomputing.
Optimizing Supercompilers for Supercomputers.
High Performance Compilers for Parallel Computing.
A hierarchical basis for reordering transformations. POPL '84: Proceedings of the 11th ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages.
A Case Study of Some Issues in the Optimization of Fortran 90 Array Notation. Scientific Programming.
One task of every Fortran 90 compiler is to scalarize the array-syntax statements of a program into equivalent sequential loop code. Most compilers require multiple passes over the program source to ensure the correctness of this translation, since their analysis algorithms work only on the scalarized form. These same compilers then make additional passes to perform loop optimizations such as loop fusion. In this paper we discuss a strategy that makes advanced scalarization and fusion decisions at the array level. We present an analysis strategy that supports our advanced scalarizer, and we describe the benefits of this methodology compared with standard practice. Experimental results show that our strategy can significantly improve the runtime performance of the compiled code while also improving the performance of the compiler itself.