Programming in VS Fortran on the IBM 3090 for Maximum Vector Performance

Authors:
Bowen Liu; Nelson Strother
Venue:
Computer
Year:
1988

Abstract

Programming techniques necessary for high performance on the IBM 3090 Vector Facility are illustrated, showing that VS Fortran programs can achieve near-maximum execution rates. Relevant features of the 3090 architecture are reviewed, stressing the need to make efficient use of the hierarchical storage system and to take advantage of the compound vector instructions. The key programming techniques for managing the storage hierarchy are loop sectioning, loop distribution, and data compaction. Vector-register reuse, cache reuse, and attention to virtual-memory storage format and page reuse are shown to lead to efficient use of the vector registers, the high-speed cache, and the virtual memory system, respectively. The multiply-and-add compound instruction is also discussed.
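
To make loop sectioning, vector-register reuse, and the multiply-and-add pattern concrete, here is a minimal sketch in C rather than VS Fortran. It is not taken from the article. The routine names `matvec_plain` and `matvec_sectioned`, the section length `SECTION`, and the buffer `acc` (which stands in for a vector register) are all assumptions made for illustration. The matrix is stored column by column, as Fortran would store it.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed section length for illustration; the abstract does not state one. */
enum { SECTION = 128 };

/* Unsectioned y = y + A*x for an m-by-n matrix stored column by column.
   Every pass over a column reloads and restores all of y. */
void matvec_plain(size_t m, size_t n, const double *a,
                  const double *x, double *y)
{
    for (size_t j = 0; j < n; j++)
        for (size_t i = 0; i < m; i++)
            y[i] += a[i + j * m] * x[j];   /* multiply-and-add */
}

/* Sectioned version: rows are processed in strips of at most SECTION
   elements. Each strip of y is loaded once into acc (a stand-in for a
   vector register), updated by every column, and stored once. */
void matvec_sectioned(size_t m, size_t n, const double *a,
                      const double *x, double *y)
{
    double acc[SECTION];

    for (size_t i0 = 0; i0 < m; i0 += SECTION) {
        size_t len = (m - i0 < SECTION) ? (m - i0) : SECTION;
        assert(len <= SECTION);

        for (size_t i = 0; i < len; i++)       /* load the strip once */
            acc[i] = y[i0 + i];

        for (size_t j = 0; j < n; j++) {       /* reuse it for every column */
            const double *col = a + i0 + j * m;
            for (size_t i = 0; i < len; i++)
                acc[i] += col[i] * x[j];       /* multiply-and-add */
        }

        for (size_t i = 0; i < len; i++)       /* store the strip once */
            y[i0 + i] = acc[i];
    }
}
```

Both routines accumulate each element of `y` in the same column order, so they compute the same result. The sectioned form differs only in how often `y` moves between storage and the register-like buffer, which is the kind of reuse the abstract refers to.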