The EM-4 is a supercomputer that offers very fast interprocessor communication and support for multithreading. In this paper we demonstrate that the EM-4, together with an automatic parallelization technique referred to as Data-Distributed Execution (DDE), offers a computing environment in which large portions of scientific code can be executed without the need for any explicit parallelism.

DDE exploits iteration-level parallelism in loops operating over arrays. It performs data-dependency analysis and uses the results to distribute the arrays over the local memories of the processing elements (PEs). The code is then transformed to “follow” the data distribution: each loop is spawned on all PEs concurrently, but its boundary conditions are modified so that each PE operates mostly on its local subranges of the data, thus reducing remote accesses to a minimum. The approach has been tested on the EM-4 by implementing several benchmark programs representative of common scientific applications. The experiments show that high speedup is achievable by automatic parallelization of conventional Fortran-like programs.
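
The idea of running the same loop on every PE with adjusted bounds can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state how arrays are distributed, so a simple contiguous block distribution is assumed here, and the names `owned_range`, `local_bounds`, and `scale_in_place` are hypothetical. The PEs are emulated by a sequential loop, and the data-dependency analysis that DDE uses to choose a distribution is omitted.

```python
def owned_range(pe, num_pes, n):
    """Half-open index range [start, stop) owned by `pe`.

    Assumes a contiguous block distribution of n elements over
    num_pes PEs; this scheme is an assumption, not taken from the paper.
    """
    block = (n + num_pes - 1) // num_pes  # ceiling division
    start = min(pe * block, n)
    stop = min(start + block, n)
    return start, stop


def local_bounds(pe, num_pes, n, lo, hi):
    """Clip the original loop bounds [lo, hi) to the block owned by `pe`."""
    start, stop = owned_range(pe, num_pes, n)
    return max(lo, start), min(hi, stop)


def scale_in_place(a, factor, num_pes):
    """Emulate `for i in range(len(a)): a[i] *= factor`.

    Every (emulated) PE runs the same loop body, but only over the
    subrange of indices that it owns.
    """
    n = len(a)
    for pe in range(num_pes):  # sequential stand-in for concurrent PEs
        lo, hi = local_bounds(pe, num_pes, n, 0, n)
        for i in range(lo, hi):  # empty when the PE owns nothing in [lo, hi)
            a[i] *= factor
    return a
```

Because the clipped ranges are disjoint and together cover the original iteration space, each element is updated exactly once, and in this loop every access a PE makes falls inside its own block. Loops that read neighbouring elements, such as stencils, would still need some remote accesses at block boundaries, which is consistent with the abstract's "mostly local" wording.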