We consider the analysis and optimization of code that uses operations and functions on entire arrays. We develop models for studying how to minimize the number of materializations of array-valued temporaries in basic blocks, where each block is a sequence of assignment statements involving array-valued variables. We derive lower bounds on the number of materializations required and develop several algorithms that minimize this number, subject to a simple constraint on allowable statement rearrangement. In contrast, we show that when statement rearrangement is unconstrained, minimizing the number of materializations becomes NP-complete, even for very simple basic blocks.
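To make the notion of a materialized temporary concrete, the following minimal sketch evaluates a two-statement basic block, T = A + B followed by C = T * D, in two ways. The function names, the example block, and the way materializations are counted are assumptions made for illustration only. They are not the models or algorithms developed in the paper.

```python
# Illustrative sketch only: the block (T = A + B; C = T * D), the function
# names, and the counting convention are hypothetical, not taken from the paper.

def block_materialized(a, b, d):
    """Evaluate the block one statement at a time."""
    t = [x + y for x, y in zip(a, b)]   # temporary T is stored as a full array
    c = [x * y for x, y in zip(t, d)]   # result C is stored as a full array
    return c, 2                         # two arrays written: T and C

def block_fused(a, b, d):
    """Evaluate the same block in a single element-wise pass."""
    # T never exists as an array; each element of T is a scalar intermediate.
    c = [(x + y) * z for x, y, z in zip(a, b, d)]
    return c, 1                         # one array written: C

if __name__ == "__main__":
    a, b, d = [1, 2, 3], [4, 5, 6], [2, 2, 2]
    print(block_materialized(a, b, d))
    print(block_fused(a, b, d))
```

Both versions compute the same C, but the second avoids storing T. In this toy case a single fused pass is always possible. The abstract's results concern longer blocks, where the order of statements affects how many temporaries must be stored.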