The parallel execution of DO loops
Communications of the ACM
The Hyperplane Method for an Array Computer
Proceedings of the Sagamore Computer Conference on Parallel Processing
Compiling Fortran 8x array features for the connection machine computer system
PPEALS '88 Proceedings of the ACM/SIGPLAN conference on Parallel programming: experience with applications, languages and systems
Cedar Fortran and other Vector and parallel Fortran dialects
Proceedings of the 1988 ACM/IEEE conference on Supercomputing
Compiler transformations for high-performance computing
ACM Computing Surveys (CSUR)
Program Improvement by Source-to-Source Transformation
Journal of the ACM (JACM)
The ILLIAC IV FORTRAN compiler
Proceedings of the conference on Programming languages and compilers for parallel and vector machines
Multiprocessor software design
ACM '80 Proceedings of the ACM 1980 annual conference
The Parallelism Analyzer and Synthesizer of the ILLIAC IV Fortran compiler (known by the mnemonic "Paralyzer") detects computations in Fortran DO loops that can be performed in parallel. It is a step of the compiling process that lies between source-language parsing and target-code generation, and as such can be regarded as a high-level optimization step specific to the ILLIAC architecture. The Paralyzer performs its transformations within the Intermediate Language tables of the compiler. The parallel execution constructs it introduces into the user's program are those that can be expressed in IVTRAN, the extended Fortran language that is the source language of the compiler [1]. With a decompiler from the Intermediate Language to IVTRAN source, the Paralyzer can act as a source-to-source translator.

Some pertinent characteristics of the ILLIAC IV motivate the parallelism detection methods employed by the Paralyzer. ILLIAC IV belongs to the general class of parallel processors known as array processors: it performs identical computations in lock-step, synchronous fashion over separate data streams. Its computational access to main memory is highly constrained: each of the 64 Processing Units can directly access only its own private section of the whole memory. Data can be passed from one Processing Unit to another by a relatively expensive routing instruction, which is executed identically by all Processing Units and passes data a uniform end-around distance in the fixed ordering of the Processing Units. The machine most efficiently executes computations that are element-by-element operations on vectors or arrays. Thus, the most fruitful sources of parallelism in Fortran programs intended for ILLIAC IV execution are DO loops containing array references whose subscripts depend on the DO index variables.
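The machine model described above can be illustrated with a small simulation. The sketch below is not the Paralyzer's algorithm and not IVTRAN; it only models, in Python, the two properties the abstract names: lock-step element-by-element operations over 64 Processing Units, and a routing step that moves every value the same end-around distance. The helper names `route` and `lockstep`, the sample arrays, and the Fortran loops quoted in the comments are all hypothetical, chosen for illustration.

```python
NUM_PES = 64  # the abstract states that there are 64 Processing Units


def route(values, distance):
    """Model of an end-around routing step (hypothetical helper).

    The value held by PE i moves to PE (i + distance) mod n; every PE
    shifts by the same distance, as the abstract describes.
    """
    n = len(values)
    out = [None] * n
    for pe, value in enumerate(values):
        out[(pe + distance) % n] = value
    return out


def lockstep(op, *operands):
    """Apply the same operation in every PE, each on its own private data."""
    return [op(*args) for args in zip(*operands)]


# One array element per PE (0-based here; Fortran would be 1-based).
b = list(range(NUM_PES))
c = [10 * i for i in range(NUM_PES)]

# Hypothetical loop:   DO 10 I = 1, 64
#                   10 A(I) = B(I) + C(I)
# Every operand already sits in the PE that needs it: no routing required.
a = lockstep(lambda x, y: x + y, b, c)

# Hypothetical loop:   DO 20 I = 1, 63
#                   20 D(I) = B(I+1) + C(I)
# B(I+1) lives in the neighbouring PE, so one routing step of distance -1
# brings it into place before the lock-step add.
b_shifted = route(b, -1)
d = lockstep(lambda x, y: x + y, b_shifted, c)
# The last PE receives a wrapped-around value; it lies outside the loop
# range and would have to be masked off on a real machine.
```

In this model the first loop costs one lock-step operation, while the second costs a routing step plus a lock-step operation, which reflects why the abstract calls routing relatively expensive and why subscripts that depend simply on the DO index are the most profitable case.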