Interprocessor communication speed and performance in distributed-memory parallel processors

Authors:
M. Annaratone;C. Pommerell;R. Rühl
Affiliations:
Integrated Systems Laboratory, Swiss Federal Institute of Technology, 8092 Zurich, Switzerland;Integrated Systems Laboratory, Swiss Federal Institute of Technology, 8092 Zurich, Switzerland;Integrated Systems Laboratory, Swiss Federal Institute of Technology, 8092 Zurich, Switzerland
Venue:
ISCA '89 Proceedings of the 16th annual international symposium on Computer architecture
Year:
1989

Abstract

We have simulated several numerical and non-numerical algorithms on five distributed-memory parallel processors (DMPPs). All five DMPPs have the same topology (a torus), and the same number of nodes. The architectures differ only in the communication speed between neighboring nodes, while the computation unit is kept unchanged. The goal of the paper is to quantify the effect that interprocessor communication speed and synchronization overhead have on the performance of the DMPPs. After introducing the rationale for this study and reviewing related work, we present and discuss the results of the simulations.
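The experimental setup described in the abstract (a fixed torus, a fixed computation unit, and only the neighbor-to-neighbor link speed varied) can be illustrated with a toy cost model. The sketch below is not the authors' simulator. The function names, the no-overlap assumption, and every numeric parameter are hypothetical and chosen only to show how per-iteration time could grow as communication slows while computation stays constant.

```python
def torus_neighbors(row, col, side):
    """Return the four nearest neighbors of node (row, col) on a side x side torus.

    Indices wrap around at the edges, which is what distinguishes a torus
    from a plain mesh.
    """
    return [
        ((row - 1) % side, col),  # north
        ((row + 1) % side, col),  # south
        (row, (col - 1) % side),  # west
        (row, (col + 1) % side),  # east
    ]


def step_time(ops_per_node, words_per_neighbor, op_time, word_time, sync_overhead):
    """Toy cost of one iteration on a single node.

    Assumption (not from the paper): computation and communication do not
    overlap, and each node exchanges the same number of words with each of
    its four neighbors.
    """
    compute = ops_per_node * op_time
    communicate = 4 * words_per_neighbor * word_time
    return compute + communicate + sync_overhead


# Five hypothetical machines: identical computation unit, different link speeds.
OP_TIME = 1.0                            # assumed time units per operation
SYNC_OVERHEAD = 50.0                     # assumed fixed cost per iteration
WORD_TIMES = [0.5, 1.0, 2.0, 4.0, 8.0]   # assumed time units per word sent

times = [step_time(1000, 32, OP_TIME, w, SYNC_OVERHEAD) for w in WORD_TIMES]

if __name__ == "__main__":
    for word_time, total in zip(WORD_TIMES, times):
        print(f"word_time={word_time}: iteration time={total}")
```

In this model the compute term is the same for all five machines, so any difference in iteration time comes from the communication term alone, which mirrors the controlled comparison the abstract describes.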