We simulated several numerical and non-numerical algorithms on five distributed-memory parallel processors (DMPPs). All five DMPPs share the same topology (a torus) and the same number of nodes; they differ only in the communication speed between neighboring nodes, and the computation unit is identical across all five. The goal of this paper is to quantify how interprocessor communication speed and synchronization overhead affect DMPP performance. After introducing the rationale for the study and reviewing related work, we present and discuss the simulation results.
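To make the experimental setup concrete, the following is a minimal sketch of the kind of comparison described above. It is not the simulator used in the paper. It assumes a 2D torus with four neighbors per node, a nearest-neighbor exchange in every step, and a simple additive cost model. The function names, the cost model, and all parameter values are hypothetical and serve only as illustration.

```python
def torus_neighbors(rank, rows, cols):
    """Return the north, south, west, and east neighbors of a node.

    Assumes a 2D torus with row-major node numbering (an assumption;
    the abstract only states that the topology is a torus).
    """
    r, c = divmod(rank, cols)
    return [
        ((r - 1) % rows) * cols + c,  # north, wraps around the top edge
        ((r + 1) % rows) * cols + c,  # south, wraps around the bottom edge
        r * cols + (c - 1) % cols,    # west, wraps around the left edge
        r * cols + (c + 1) % cols,    # east, wraps around the right edge
    ]


def step_time(flops, words, flop_time, word_time, sync_overhead):
    """Time for one step under a hypothetical additive cost model.

    Each node computes `flops` operations, then exchanges `words` words
    with each of its four neighbors. Every exchange pays a fixed
    synchronization overhead plus a per-word transfer cost.
    """
    compute = flops * flop_time
    communicate = 4 * (sync_overhead + words * word_time)
    return compute + communicate


def sweep(word_times, flops=1000, words=10, flop_time=1, sync_overhead=5):
    """Compare machines that differ only in communication speed.

    Returns (word_time, step_time, efficiency) tuples, where efficiency
    is the fraction of the step spent on computation.
    """
    results = []
    for word_time in word_times:
        total = step_time(flops, words, flop_time, word_time, sync_overhead)
        results.append((word_time, total, flops * flop_time / total))
    return results
```

Calling `sweep([1, 2, 4, 8, 16])` models five machines with an identical computation unit and progressively slower links. In this toy model, efficiency falls as the per-word cost grows, and the fixed synchronization overhead sets a floor on communication time even for very fast links.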