The proliferation of parallel platforms over the last ten years has been dramatic. Parallel platforms come in many flavors, including desktop multiprocessor PCs and workstations with a few processors, networks of PCs and workstations, and supercomputers with hundreds of processors or more. This diverse collection of parallel platforms provides not only computing cycles but also other resources important for scientific computing, such as large amounts of main memory and fast I/O capabilities. As a result, the "typical profile" of a potential user of such systems has changed considerably. The specialist user with a good understanding of the complexities of the target parallel system has been replaced by a user who is largely unfamiliar with the underlying system characteristics. While the specialist's main concern is peak performance, the non-specialist user may be willing to trade off performance for ease of programming.

Recent languages such as High Performance Fortran (HPF) and SGI Parallel Fortran are a significant step towards making parallel platforms truly usable for a broadening user community. However, non-trivial user input is still required to produce efficient parallel programs. The main challenge for a user is to understand the performance implications of a specified data layout, which requires knowledge of issues such as the code generation and analysis strategies of the HPF compiler and its node compiler, and the performance characteristics of the target architecture.

This paper discusses our preliminary experiences with the design and implementation of Fortran RED, a tool that supports Fortran as a deterministic, sequential programming model on different parallel target systems. The tool is not part of a compiler. Fortran RED uses HPF as its intermediate program representation, since HPF is portable across many parallel platforms and both commercial and research HPF compilers are widely available.
Fortran RED is able to support different target HPF compilers and target architectures, and it allows multi-dimensional distributions as well as dynamic remapping. This paper focuses on the performance prediction component of the tool and reports preliminary results for a single scientific kernel on two target systems: PGI's and IBM's HPF compilers, each with IBM's SP-2 as the target architecture.
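To illustrate the kind of data-layout decisions whose performance implications a user (or a tool such as Fortran RED) must reason about, the following is a minimal HPF fragment showing standard directives for a multi-dimensional distribution and dynamic remapping. The array names, sizes, and processor shape are illustrative only and are not taken from the paper:

```fortran
      REAL A(1024,1024), B(1024,1024)
!HPF$ PROCESSORS P(4,4)
! Two-dimensional block distribution of A over a 4x4 processor grid.
!HPF$ DISTRIBUTE A(BLOCK, BLOCK) ONTO P
! Co-locate B element-wise with A to avoid communication between them.
!HPF$ ALIGN B(I,J) WITH A(I,J)
! A may be remapped at run time, so it must be declared DYNAMIC.
!HPF$ DYNAMIC A
!     ... first computation phase favoring (BLOCK, BLOCK) ...
! Dynamic remapping: switch A to a cyclic distribution in the
! first dimension for a later phase with a different access pattern.
!HPF$ REDISTRIBUTE A(CYCLIC, BLOCK) ONTO P
```

Each such choice changes the communication generated by the HPF compiler, which is exactly the cost that a performance prediction component must estimate.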